CHAPTER 1


Principles of Analyzing Algorithms and Problems

An algorithm is a finite set of computational instructions, each of which can be executed in finite
time, that takes some value or set of values as input and produces some value or set of values as
output in order to perform a computation or solve a problem. Algorithms do not depend on a
particular machine, programming language, or compiler; that is, an algorithm behaves the same way
everywhere. An algorithm is therefore a mathematical object, and we assume it runs on a machine
with unlimited capacity.

Examples of problems 

• You are given two numbers; how do you find their Greatest Common Divisor?

• Given an array of numbers, how do you sort them? 
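
As an illustration of the first problem, here is a minimal sketch of one well-known approach, Euclid's method. The notes do not say which GCD algorithm they have in mind, so the choice of method and the function name gcd are assumptions made for this example.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid's method)."""
    while b != 0:
        # gcd(a, b) equals gcd(b, a mod b); b shrinks on every pass,
        # so the loop terminates (finiteness).
        a, b = b, a % b
    return a


print(gcd(48, 18))
```

Each pass through the loop is a definite, finite step, which matches the properties of an algorithm listed in these notes.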

We need algorithms to understand the basic concepts of computer science and programming.
To understand where the computation is done and how the input and output of a problem are
related, we must be able to follow the steps involved in getting the output(s) from the given input(s).

You also need to know how algorithms are designed. If you only study existing algorithms, you
are limited to those algorithms and to choosing among them. If you know the design principles,
you can attempt to improve performance by applying different design principles.

The analysis of algorithms gives good insight into the algorithms under study. Analysis tries
to answer a few questions: Is the algorithm correct, i.e. does it generate the required result?
Does the algorithm terminate for all inputs in the problem domain? Other issues of analysis are
efficiency, optimality, etc. By knowing these aspects of different algorithms for the same problem
domain, we can choose the better algorithm for our need. This is done by knowing the resources
the algorithm needs for its execution. The two most important resources are time and space. Both
are measured in terms of complexity; for time, instead of absolute time we consider the growth of
the running time with the input size.

Algorithms Properties 

• Input(s)/output(s): There must be some inputs from the standard set of inputs, and an
algorithm's execution must produce output(s).

• Definiteness: Each step must be clear and unambiguous.

• Finiteness: Algorithms must terminate after a finite time or a finite number of steps.

• Correctness: The correct set of output values must be produced for each set of inputs.

• Effectiveness: Each step must be carried out in finite time. 

Here we deal with correctness and finiteness.

Random Access Machine Model (RAM)

The RAM model is the base model for our study of the design and analysis of algorithms; it lets
us design and analyze in a machine-independent setting. In this model each basic operation (+, -, etc.)
takes 1 step; loops and subroutines are not basic operations. Each memory reference takes 1 step. We
measure the running time of an algorithm by counting steps.


Algorithms:

> Designing of algorithms (studied in detail)

> Analysis of algorithms (studied in detail)

> Validation of algorithms (studied briefly)

> Testing of algorithms (not studied)


Best, Worst and Average case

Best case complexity: The least possible execution time taken by an algorithm over all inputs
of a given size is known as the best case. It gives a lower bound on the running time of the
algorithm for any instance of the input: the algorithm can never run faster than its best case
on any input of that size.

Worst case complexity: The maximum possible execution time taken by an algorithm over all
inputs of a given size is known as the worst case. It gives an upper bound on the running time of
the algorithm for all instances of the input. This ensures that no input can exceed the
running time limit set by the worst case complexity.

Average case complexity: It gives the average number of steps required over all instances of the
input of a given size.

Example: let's take an algorithm for Quick sort

QuickSort(A, l, r)
{
    if (l < r)
    {
        p = Partition(A, l, r);
        QuickSort(A, l, p-1);
        QuickSort(A, p+1, r);
    }
}

Partition(A, l, r)
{
    x = l; y = r; p = A[l];              // the pivot is the first element
    while (x < y)
    {
        while (x < r && A[x] <= p)       // move x right past items <= pivot
        {
            x++;
        }
        while (A[y] > p)                 // move y left past items > pivot
        {
            y--;
        }
        if (x < y)
            swap(A[x], A[y]);
    }
    A[l] = A[y]; A[y] = p;
    return y; // return position of pivot
}

Best Case Time Complexity:

Partition divides the array into two parts of equal size, therefore
T(n) = 2T(n/2) + O(n). Solving this recurrence we get,

=> Time Complexity = O(n log n)

Worst Case Time Complexity:

When the array is already sorted or sorted in reverse order, one partition contains n-1 items and
the other contains zero items. Partitioning still costs O(n), therefore the recurrence relation is,

T(n) = T(n-1) + O(n). Solving this recurrence we get,
=> Time Complexity = O(n^2)

Average Case Time Complexity:

To illustrate the average case, suppose every partition divides the elements in the ratio 9:1.
Then the recurrence relation for this case is,

T(n) = T(9n/10) + T(n/10) + O(n). Solving this recurrence we get,
=> Time Complexity = O(n log n)


* NOTE: - In our study we concentrate on worst case complexity only. 
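
The Quick sort pseudocode above can be mirrored in runnable form. The following is a minimal sketch, not a tuned implementation; it assumes the first element of each range is the pivot, and it adds a bounds check on the left-to-right scan so the index cannot run past the end of the range. The names partition and quick_sort are chosen for this example.

```python
def partition(a, l, r):
    """Place the pivot a[l] at its final position and return that index."""
    x, y, p = l, r, a[l]
    while x < y:
        while x < r and a[x] <= p:   # skip items that belong left of the pivot
            x += 1
        while a[y] > p:              # skip items that belong right of the pivot
            y -= 1
        if x < y:
            a[x], a[y] = a[y], a[x]
    a[l], a[y] = a[y], p             # drop the pivot into its final slot
    return y


def quick_sort(a, l=0, r=None):
    """Sort the list a in place between indices l and r (inclusive)."""
    if r is None:
        r = len(a) - 1
    if l < r:
        p = partition(a, l, r)
        quick_sort(a, l, p - 1)
        quick_sort(a, p + 1, r)


data = [5, 2, 9, 1, 5, 6]
quick_sort(data)
print(data)
```

On an already sorted list every call to partition returns l, so one side of each split is empty; this is the worst case described above.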


Example 1: Detailed analysis of Bubble sort:

Bubble_Sort(A, n)
{
    for (i = 1; i <= n; i++)
    {
        for (j = 0; j < n-i; j++)
        {
            if (A[j] > A[j+1])
            {
                temp = A[j];
                A[j] = A[j+1];
                A[j+1] = temp;
            }
        }
    }
}


Analysis:

Space complexity:

S.C. = 1 + 1 + 1 + 1 + n = n + 4 = O(n)

Time complexity:

Within the first for loop:

i = 1 takes 1 step
i <= n takes (n+1) steps
i++ takes n steps

Within the second for loop:

j = 0 takes n steps (once per iteration of the outer loop)
j < n-i takes [n + (n-1) + (n-2) + ... + 2 + 1] steps
j++ takes [(n-1) + (n-2) + ... + 2 + 1] steps

In the if statement:

It takes at most 3 * [(n-1) + (n-2) + ... + 2 + 1] steps

So the total time complexity (T.C.) is

T.C. = 1 + (n+1) + n + [n + [n + (n-1) + (n-2) + ... + 3 + 2 + 1] + [(n-1) + (n-2) + ... + 3 + 2 + 1]]
       + 3 * [(n-1) + (n-2) + ... + 3 + 2 + 1]
     = 2n + 2 + [n + n(n+1)/2 + n(n-1)/2] + 3n(n-1)/2
     = 2n + 2 + n + n^2/2 + n/2 + n^2/2 - n/2 + 3n^2/2 - 3n/2
     = (5n^2 + 3n + 4)/2
     = O(n^2)


In brief:

The outer for loop executes at most n times.
The inner for loop executes at most n times.
Since the loops are nested, the time complexity is

T(n) = n * n = O(n^2)
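
The quadratic growth can be observed directly by counting comparisons. The sketch below mirrors the Bubble sort pseudocode with 0-based indexing; the counter and the function name bubble_sort are additions made for this illustration only.

```python
def bubble_sort(a):
    """Sort the list a in place and return the number of comparisons made."""
    n = len(a)
    comparisons = 0
    for i in range(1, n + 1):          # outer loop: i = 1 .. n
        for j in range(0, n - i):      # inner loop: j = 0 .. n-i-1
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return comparisons


data = [4, 3, 2, 1]
print(bubble_sort(data), data)
```

For a list of length n the counter always ends at (n-1) + (n-2) + ... + 1 = n(n-1)/2, the same sum that appears in the step count above.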

Example 2: Analysis of nth Fibonacci number generating algorithm:

Input: n

Output: nth Fibonacci number.

Algorithm: assume a as first (previous) and b as second (current) numbers

Fib(n)
{
    a = 0, b = 1, f = 1;
    for (i = 2; i <= n; i++)
    {
        f = a + b;
        a = b;
        b = f;
    }
    return f;
}
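
The pseudocode above translates almost line for line into runnable code. This is a minimal sketch that assumes n >= 1 and the numbering Fib(1) = Fib(2) = 1, which is what the initial values a = 0, b = 1, f = 1 imply.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number for n >= 1, with fib(1) = fib(2) = 1."""
    a, b, f = 0, 1, 1
    for _ in range(2, n + 1):   # one pass per value of i = 2 .. n
        f = a + b
        a = b
        b = f
    return f


print([fib(n) for n in range(1, 11)])
```

Only three variables are kept regardless of n, and the loop body runs once for each i from 2 to n.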

Efficiency

Time Complexity: The loop above iterates n-1 times, so the time complexity is O(n).
Space Complexity: The space complexity is constant, i.e. O(1).

Mathematical Foundation


Mathematics can provide a clear view of an algorithm. Understanding mathematical concepts
is an aid in the design and analysis of good algorithms. Here we present some of the
mathematical concepts that are helpful in our study.

Exponents 

Some of the formulas that are helpful are: 

• x^a x^b = x^(a+b)

• x^a / x^b = x^(a-b)

• (x^a)^b = x^(ab)

• x^n + x^n = 2x^n

• 2^n + 2^n = 2^(n+1)


Logarithms 

Some of the formulas that are helpful are: 

1. log_a b = log_c b / log_c a ; c > 0, c ≠ 1 [making the base the same]

2. log ab = log a + log b

3. log a/b = log a - log b

4. log (a^b) = b log a

5. log x < x for all x > 0

6. log 1 = 0, log 2 = 1, log 1024 = 10 (logarithms are to base 2).

7. a^(log_b n) = n^(log_b a)


Series

Σ_{i=1}^{n} i = n(n+1)/2

Σ_{i=1}^{n} i^2 = n(n+1)(2n+1)/6

Σ_{i=0}^{n} a^i <= 1/(1-a), if 0 < a < 1

Σ_{i=0}^{n} 2^i = 2^(n+1) - 1


Asymptotic Notation

Exact complexity analysis of an algorithm is very hard. We know that the complexity (worst,
best, or average) of an algorithm is a mathematical function of the size of the input. If we
analyze the algorithm in terms of bounds (upper and lower), the analysis becomes easier.
For this purpose we need the concept of asymptotic notations. The figure below gives the upper and
lower bound concept.

Big Oh (O) notation

When we have only an asymptotic upper bound, we use O notation.

If f and g are any two functions from the set of integers to the set of integers, then f(x) is said to
be big oh of g(x), i.e. f(x) = O(g(x)), iff there exist two positive constants c and x0 such that
for all x >= x0, 0 <= f(x) <= c*g(x)

The above relation says that g(x) is an upper bound of f(x).

Some properties:

Transitivity: f(x) = O(g(x)) & g(x) = O(h(x)) then f(x) = O(h(x))

Reflexivity: f(x) = O(f(x))

O(1) is used to denote constants.

For all values of n >= n0, the plot shows clearly that f(n) lies below or on the curve of c*g(n).



Examples

1. f(n) = 3n^2 + 4n + 7

        <= 3n^2 + 4n^2 + 7n^2 = 14n^2 for all n >= 1
   => f(n) <= 14n^2

   where c = 14, n0 = 1 and g(n) = n^2, thus f(n) = O(g(n)) = O(n^2)

2. Prove that n log(n^3) is O(sqrt(n^3)).

Proof: We have n log(n^3) = 3n log n.
Again, sqrt(n^3) = n sqrt(n).

If we can prove log n = O(sqrt(n)), then the problem is solved,
because then n log n = n * O(sqrt(n)) = O(n sqrt(n)), which is the required bound.

We can use the fact that log^a n is O(n^b) for all a, b > 0.

In our problem a = 1 and b = 1/2,
hence log n = O(sqrt(n)).

So, knowing that log n = O(sqrt(n)), we have proved that
n log(n^3) = O(sqrt(n^3)).

Big Omega (Ω) notation

Big omega notation gives an asymptotic lower bound.

If f and g are any two functions from the set of integers to the set of integers, then f(x) is said to
be big omega of g(x), i.e. f(x) = Ω(g(x)), iff there exist two positive constants c and x0 such that
for all x >= x0, 0 <= c*g(x) <= f(x).

The above relation says that g(x) is a lower bound of f(x).

For all values of n >= n0, the plot shows clearly that f(n) lies above or on the curve of c*g(n).


Examples

1. f(n) = 3n^2 + 4n + 7

   g(n) = n^2, then prove that f(n) = Ω(g(n)).

Proof: Let us choose c and n0 as 3 and 1, respectively. Then we have

f(n) >= c*g(n) for n >= n0, as

3n^2 + 4n + 7 >= 3*n^2 for all n >= 1

The above inequality is trivially true.

Hence f(n) = Ω(g(n))


Big Theta (Θ) notation

When we need an asymptotically tight bound, we use Θ notation.

If f and g are any two functions from the set of integers to the set of integers, then f(x) is said to
be big theta of g(x), i.e. f(x) = Θ(g(x)), iff there exist three positive constants c1, c2 and x0 such that
for all x >= x0, c1*g(x) <= f(x) <= c2*g(x)

The above relation says that f(x) is of the order of g(x).

Some properties:

Transitivity: f(x) = Θ(g(x)) & g(x) = Θ(h(x)) then f(x) = Θ(h(x))

Reflexivity: f(x) = Θ(f(x))

Symmetry: f(x) = Θ(g(x)) iff g(x) = Θ(f(x))

[Figure: f(n) = Θ(g(n))]

For all values of n >= n0, the plot shows clearly that f(n) lies between c1*g(n) and c2*g(n).


Example

1. f(n) = 3n^2 + 4n + 7

   g(n) = n^2, then prove that f(n) = Θ(g(n)).

Proof: Let us choose c1, c2 and n0 as 1, 14 and 1, respectively. Then we have
f(n) >= c1*g(n) for n >= n0, as 3n^2 + 4n + 7 >= 1*n^2, and
f(n) <= c2*g(n) for n >= n0, as 3n^2 + 4n + 7 <= 14*n^2,
for all n >= 1 (in both cases).

So c1*g(n) <= f(n) <= c2*g(n) holds.

Hence f(n) = Θ(g(n)).
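
A numeric spot check can make the choice of constants concrete. The sketch below tests the sandwich c1*g(n) <= f(n) <= c2*g(n) over a finite range only, so it illustrates the bound but does not prove it. It assumes the lower constant is 1 and the upper constant is 14, with n0 = 1.

```python
def f(n):
    return 3 * n * n + 4 * n + 7


def g(n):
    return n * n


c1, c2, n0 = 1, 14, 1   # lower constant, upper constant, threshold
holds = all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, 10_000))
print(holds)
```

At n = 1 the upper bound is met with equality (f(1) = 14), which shows that 14 is the smallest upper constant that works with n0 = 1.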


Recurrences


• To analyze recursive algorithms we need to find their recurrence relations.

• A recurrence relation is an equation or inequality that describes a function in terms of its
  values on smaller inputs.

For Example:

Recursive algorithm for finding factorial

T(n) = 1 when n = 1

T(n) = T(n-1) + O(1) when n > 1

Recursive algorithm for finding Nth Fibonacci number

T(1) = 1 when n = 1

T(2) = 1 when n = 2

T(n) = T(n-1) + T(n-2) + O(1) when n > 2

Recursive algorithm for binary search

T(1) = 1 when n = 1

T(n) = T(n/2) + O(1) when n > 1


Cost of solving recursive algorithm = cost of dividing problem + cost of solving
sub problems + cost of merging solutions

Techniques for Solving Recurrences

We'll use the following techniques:

• Iteration method

• Recursion Tree

• Substitution

• Master Method - for divide & conquer

• Characteristic Equation [not needed to study] - for linear

Iteration method

Here we expand the given relation until the boundary condition is met.

Expand the relation until a summation in terms of n alone is obtained.

Example 1: T(n) = 2T(n/2) + 1 when n > 1
           T(n) = 1 when n = 1

Soln: T(n) = 2T(n/2) + 1
           = 2{2T(n/4) + 1} + 1
           = 2^2 T(n/2^2) + 2 + 1
           = 2^2 {2T(n/2^3) + 1} + 2 + 1
           = 2^3 T(n/2^3) + 2^2 + 2 + 1
           ...
           = 2^k T(n/2^k) + 2^(k-1) + ... + 4 + 2 + 1

For simplicity assume:
n/2^k = 1
or, n = 2^k

Taking log on both sides,
log n = log 2^k
log n = k log 2

=> k = log n [since log 2 = 1]


Now, T(n) = 2^k T(n/2^k) + 2^(k-1) + ... + 4 + 2 + 1

=> T(n) = 2^k T(1) + 2^(k-1) + ... + 2^2 + 2^1 + 2^0

=> T(n) = (2^(k+1) - 1)/(2 - 1)

=> T(n) = 2^(k+1) - 1
=> T(n) = 2*2^k - 1
=> T(n) = 2n - 1
=> T(n) = O(n)
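
The closed form obtained by iteration can be cross-checked by evaluating the recurrence directly. This is a small sketch for n a power of two, with T(1) = 1; the function name t is chosen for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def t(n: int) -> int:
    """T(n) = 2T(n/2) + 1 with T(1) = 1, evaluated for n a power of two."""
    if n == 1:
        return 1
    return 2 * t(n // 2) + 1


for k in range(6):
    n = 2 ** k
    print(n, t(n), 2 * n - 1)   # the last two columns should agree
```

Evaluating a recurrence for a few small inputs like this is a quick way to catch an arithmetic slip in an expansion before trusting the final bound.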


Example 2:

T(n) = T(n/3) + O(n) when n > 1

T(n) = 1 when n = 1

Soln: T(n) = T(n/3) + O(n)

=> T(n) <= T(n/3) + cn

=> T(n) <= T(n/3^2) + cn/3 + cn

=> T(n) <= T(n/3^3) + cn/3^2 + cn/3 + cn

=> T(n) <= T(n/3^4) + cn/3^3 + cn/3^2 + cn/3 + cn
   ...
=> T(n) <= T(n/3^k) + cn/3^(k-1) + ... + cn/3^2 + cn/3 + cn

For simplicity assume
n/3^k = 1
or, n = 3^k

Taking log on both sides,
log n = log 3^k
=> k = log_3 n

=> T(n) <= T(1) + cn/3^(k-1) + ... + cn/3^2 + cn/3 + cn

=> T(n) <= 1 + {cn/3^(k-1) + ... + cn/3^2 + cn/3 + cn}

=> T(n) <= 1 + cn {1/(1 - 1/3)}

=> T(n) <= 1 + (3/2) cn
=> T(n) = O(n)

Example 3: T(n) = T(n-1) + O(1)

                = T(n-2) + 1 + 1 [taking O(1) = 1, i.e. choosing c = 1]
                = T(n-3) + 1 + 1 + 1
                = T(n-4) + 1 + 1 + 1 + 1
                ...
                = T(n-k) + 1 + ... + 1 (k times)

Let's choose n-k = 1
=> k = n-1

Now T(n) = T(1) + 1 + ... + 1 (k times)

         = 1 + k*1
         = k + 1
         = n - 1 + 1 = n
=> T(n) = O(n)

Example 4: T(n) = 2T(n/2) + n

                = 2[2T(n/2^2) + n/2] + n
                = 2^2 T(n/2^2) + n + n
                = 2^2 [2T(n/2^3) + n/2^2] + n + n
                = 2^3 T(n/2^3) + n + n + n
                ...
                = 2^k T(n/2^k) + n + n + ... + n (k times)

Let's put n/2^k = 1

=> n = 2^k

Taking log on both sides,
log n = log 2^k
or, log n = k log 2
=> k = log n

Now, T(n) = n T(1) + n + n + ... + n (k times)

          = n + k*n
          = (kn + n)
          = log n * n + n
          = n log n + n
=> T(n) = O(n log n)

Example 5: T(n) = T(n/3) + n

                = T(n/3^2) + n/3 + n
                = T(n/3^3) + n/3^2 + n/3 + n
                = T(n/3^4) + n/3^3 + n/3^2 + n/3 + n
                ...
                = T(n/3^k) + n/3^(k-1) + ... + n/3^2 + n/3 + n

Let's put n/3^k = 1

=> 3^k = n

Taking log on both sides
=> log 3^k = log n
=> k log 3 = log n
=> k = log n / log 3

Now T(n) = T(1) + n/3^(k-1) + ... + n/3^2 + n/3^1 + n/3^0

         = 1 + n[1/3^0 + 1/3^1 + 1/3^2 + 1/3^3 + ... + 1/3^(k-2) + 1/3^(k-1)]

Since this is a geometric series with common ratio r = 1/3,

T(n) = 1 + n[1 - (1/3)^k] / [1 - 1/3]

     = 1 + n[1 - 1/3^k] / (2/3)
     = 1 + (3n/2)[1 - 1/n]
     = 1 + (3n/2) * (n-1)/n
     = 1 + 3n/2 - 3/2
     = 3n/2 - 1/2
     = O(n)

Hence T(n) = O(n)


Sum of Geometric series:

S_n = a(r^n - 1)/(r - 1) if r > 1

S_n = a(1 - r^n)/(1 - r) if r < 1


Recursion Tree

This is just a simplification of the iteration method: a pictorial representation of it.

Example 1: Consider the recurrence

T(1) = 1 when n = 1

T(n) = T(n/2) + 1 when n > 1

Cost at each level = 1
For simplicity assume that n/2^k = 1
=> n = 2^k

Taking log on both sides,

=> log n = log 2^k
=> k log 2 = log n
=> k = log n

Summing the cost at each level,

Total cost = 1 + 1 + 1 + ... + T(n/2^k)

           = 1 + 1 + ... + 1 (k times) + T(1)
           = k*1 + 1
           = (k + 1)
           = log n + 1

=> T(n) = O(log n)


Example 2:

T(1) = 1 when n = 1

T(n) = T(n-1) + 1 when n > 1

T.C. = T(n) = 1 + 1 + ... + 1 (k times) + T(n-k)

            = k*1 + T(n-k)

Let's put n-k = 1

=> k = n-1
Now T(n) = k + T(1)

         = n - 1 + 1 = n
         = O(n)

=> T(n) = O(n)

Example 3:

T(1) = 1 when n = 1

T(n) = 2T(n/2) + 1 when n > 1

Soln:

Level i of the recursion tree has 2^i nodes, each of cost 1, so level i costs 2^i.

Now T(n) = 2^0 + 2^1 + 2^2 + ... + 2^k

         = 1 + 2[2^k - 1]/(2 - 1)
         = 1 + 2(2^k - 1)
         = 2*2^k - 1

For simplicity assume that n/2^k = 1
=> n = 2^k

Taking log on both sides,

=> log n = log 2^k
=> k log 2 = log n
=> k = log n

Now T(n) = 2n - 1
Hence, T(n) = O(n)

Example 4:

T(1) = 1 when n = 1

T(n) = 2T(n/2) + n when n > 1

Soln:

Each level of the recursion tree costs n in total, and each of the 2^k leaves costs T(n/2^k).

Now T(n) = n + n + n + ... + n (k times) + 2^k T(n/2^k)

         = n*k + 2^k T(n/2^k)

For simplicity assume that n/2^k = 1
=> n = 2^k

Taking log on both sides,

=> log n = log 2^k
=> k log 2 = log n
=> k = log n
Now T(n) = kn + n T(1)

         = n log n + n
Hence, T(n) = O(n log n)

Example 5:

T(1) = 1 when n = 1

T(n) = T(n/2) + T(n/3) + O(1) when n > 1


Soln:

[Figure: recursion tree whose root has children T(n/2) and T(n/3); the deepest branch reaches
T(n/2^k), and level k has at most 2^k nodes]

Now total cost, T(n) <= 2^0 + 2^1 + 2^2 + ... + 2^k

                     = 1 + [2(2^k - 1)/(2 - 1)]
                     = 1 + 2*2^k - 2
                     = 2*2^k - 1
Let's put n/2^k = 1
=> n = 2^k

Taking log on both sides,

=> log n = log 2^k
=> k log 2 = log n
=> k = log n
Now T(n) <= 2*n - 1

Hence, T(n) = O(n)
Example 6:

T(n) = T(n/4) + T(n/2) + n^2

[Figure: recursion tree. The root costs n^2; its children T(n/4) and T(n/2) cost (n/4)^2 and
(n/2)^2; the next level holds T(n/16), T(n/8), T(n/8), T(n/4); and so on down to T(n/2^k).
The level costs are n^2, 5n^2/16, 5^2 n^2/16^2, 5^3 n^2/16^3, ..., at most 5^k n^2/16^k.]

Total Cost <= n^2 + 5n^2/16 + 5^2 n^2/16^2 + 5^3 n^2/16^3 + ... + 5^k n^2/16^k

{Why <=? Why not =?}

           = n^2 (1 + 5/16 + 5^2/16^2 + 5^3/16^3 + ... + 5^k/16^k)

           = n^2 [1 - (5/16)^(k+1)] / (1 - 5/16)

           < n^2 / (1 - 5/16)

           = (16/11) n^2

Let's put n/2^k = 1

=> n = 2^k

Taking log on both sides,
log n = log 2^k
=> log n = k log 2
=> k = log n [since log 2 = 1]

The bound above holds for every k, and in particular for k = log n, so

T(n) < (16/11) n^2
Hence T(n) = O(n^2)
Substitution Method

In this method, to obtain the upper bound (worst case time complexity) of the recurrence relation
we use the following two steps:

1. First guess the solution.

2. Then verify the guess by mathematical induction.


Note: Making a good initial guess depends on practice.


Example 1:

T(n) = 1 when n = 1

T(n) = 4T(n/2) + n when n > 1

Soln: Guess T(n) = O(n^3)

=> T(n) <= cn^3, for all n >= n0 ...(1)

Now prove this by mathematical induction.


Base step:

For n = 1:

T(1) = 1 <= c*1^3 [by definition of T]

i.e. 1 <= c, which is true for all c >= 1

Inductive step:

Let's assume that it is true for all k < n.

Then T(k) <= ck^3 ...(2)


It is also true for k = n/2.
Now equation (2) becomes,

T(n/2) <= c(n/2)^3
        = cn^3/8

Now, T(n) = 4T(n/2) + n
         <= 4cn^3/8 + n
          = cn^3/2 + n
          = cn^3 - cn^3/2 + n
          = cn^3 - n(cn^2/2 - 1)
         <= cn^3 whenever cn^2/2 - 1 >= 0, e.g. for c >= 2 and n >= 1
Hence T(n) <= cn^3 for all n >= 1
Thus T(n) = O(n^3). Proved

Example 2: Show that O(n^2) is its solution


T(n) = 1 when n = 1

T(n) = 4T(n/2) + n when n > 1

Soln: Guess: T(n) = O(n^2).

T(n) <= cn^2 for all n >= n0 ...(1)


Now prove this relation by mathematical induction.
Base step:

For n = 1,

T(1) = 1 <= c*1^2 [by definition of T]
i.e. 1 <= c, which is true for all c >= 1


Inductive step:

Let's assume that it is true for all k < n.

Then T(k) <= ck^2 ...(2)

It is also true for k = n/2.
Now equation (2) becomes,

T(n/2) <= c(n/2)^2
        = cn^2/4

Now, T(n) = 4T(n/2) + n
         <= 4cn^2/4 + n
          = cn^2 + n

It is not possible to show that cn^2 + n <= cn^2 for n > 0, so we strengthen the guess by
subtracting a lower order term:

T(n) <= cn^2 - dn [this still gives T(n) = O(n^2), since cn^2 - dn <= cn^2] ...(3)

where c and d are positive constants.

Now prove this relation by mathematical induction.

Base step:

For n = 1,

T(1) = 1 <= c*1^2 - d*1 [by definition of T]

i.e. 1 <= c - d, which is true for any positive c and d with c >= d + 1


Inductive step:

Let's assume that it is true for all k < n.

Then T(k) <= ck^2 - dk ...(4)

It is also true for k = n/2.
Now equation (4) becomes,

T(n/2) <= c(n/2)^2 - dn/2
        = cn^2/4 - dn/2
Now, T(n) = 4T(n/2) + n
         <= 4[cn^2/4 - dn/2] + n
          = cn^2 - 2dn + n
          = cn^2 - dn - dn + n
          = (cn^2 - dn) - n(d - 1)
         <= (cn^2 - dn) for d >= 1

=> T(n) <= (cn^2 - dn) for all n >= 1 (for example with c = 2 and d = 1)

Thus T(n) = O(n^2). Proved
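
The strengthened guess can be sanity-checked numerically. The sketch below evaluates T(n) = 4T(n/2) + n with T(1) = 1 for powers of two and compares it with cn^2 - dn. The constants c = 2 and d = 1 are illustrative choices that satisfy c >= d + 1 and d >= 1; other choices would work as well.

```python
def t(n: int) -> int:
    """T(n) = 4T(n/2) + n with T(1) = 1, evaluated for n a power of two."""
    return 1 if n == 1 else 4 * t(n // 2) + n


c, d = 2, 1   # illustrative constants for the bound c*n^2 - d*n
for k in range(6):
    n = 2 ** k
    print(n, t(n), c * n * n - d * n)
```

For these constants the two columns coincide, which suggests the bound with the lower order term subtracted is tight for this recurrence.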


Ability to guess effectively comes with experience. 


Example 3: Show that O(n^3) is the solution of

T(n) = 8T(n/2) + n^2 by using the substitution method (take T(1) = 1)

Soln: Guess: T(n) = O(n^3)

T(n) <= cn^3 for all n >= n0 ...(1)

Now prove this relation by mathematical induction.
Base step:

For n = 1,

T(1) = 1 <= c*1^3 [by definition of T]

i.e. 1 <= c, which is true for all c >= 1

Inductive step:

Let's assume that it is true for all k < n.

Then T(k) <= ck^3 ...(2)

It is also true for k = n/2.
Now equation (2) becomes,

T(n/2) <= c(n/2)^3
        = cn^3/8

Now, T(n) = 8T(n/2) + n^2
         <= 8cn^3/8 + n^2
          = cn^3 + n^2

It is not possible to show that cn^3 + n^2 <= cn^3 for n > 0, so we strengthen the guess by
subtracting a lower order term:

T(n) <= cn^3 - dn^2 [this still gives T(n) = O(n^3), since cn^3 - dn^2 <= cn^3] ...(3)

where c and d are positive constants.

Now prove this relation by mathematical induction.

Base step:

For n = 1,

T(1) = 1 <= c*1^3 - d*1^2 [by definition of T]

i.e. 1 <= c - d, which is true for any positive c and d with c >= d + 1

Inductive step:

Let's assume that it is true for all k < n.

Then T(k) <= ck^3 - dk^2 ...(4)

It is also true for k = n/2.
Now equation (4) becomes,

T(n/2) <= c(n/2)^3 - d(n/2)^2
        = cn^3/8 - dn^2/4
Now, T(n) = 8T(n/2) + n^2
         <= 8[cn^3/8 - dn^2/4] + n^2
          = cn^3 - 2dn^2 + n^2
          = cn^3 - dn^2 - dn^2 + n^2
          = (cn^3 - dn^2) - n^2 (d - 1)
         <= (cn^3 - dn^2) for d >= 1

=> T(n) <= (cn^3 - dn^2) for all n >= 1

Thus T(n) = O(n^3). Proved
Changing Variables:

Sometimes a little algebraic manipulation can make an unknown recurrence similar to one we have
seen.

Consider the example

T(n) = 2T(⌊n^(1/2)⌋) + log n
This looks difficult, so we rearrange it.
Let m = log n
=> n = 2^m
Thus,

T(2^m) = 2T(2^(m/2)) + m
Again let S(m) = T(2^m)

S(m) = 2S(m/2) + m
We can show that
S(m) = O(m log m)

=> T(n) = T(2^m) = S(m) = O(m log m) = O(log n log log n)


Master Method

The master method is used to solve recurrence relations of the form

T(n) = aT(n/b) + f(n)

where a >= 1 and b > 1 are constants and f(n) is an asymptotically positive function.
If the recurrence relation is in this form, then one of the following four cases occurs:

Master Method Case 1

If f(n) = O(n^(log_b a - ε)) for some constant ε > 0
Then

T(n) = Θ(n^(log_b a))


Master Method Case 2

If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a*f(n/b) <= c*f(n) for some
constant c < 1 and all sufficiently large n
Then

T(n) = Θ(f(n))


Master Method Case 3

If f(n) = Θ(n^(log_b a))

Then

T(n) = Θ(f(n) log n)

In the above three cases we compare f(n) with n^(log_b a) and then find the complexity
of the given recurrence relation.

Master Method Case 4

In this case the master method cannot be applied.
Example: T(n) = 4T(n/2) + n^2/log n


Soln: Comparing this relation with the general form T(n) = aT(n/b) + f(n),
we have a = 4, b = 2 and f(n) = n^2/log n
Now, n^(log_b a) = n^(log_2 4) = n^(2 log_2 2) = n^2

Test case 1: f(n) = O(n^(log_b a - ε))

=> f(n) <= n^(log_b a - ε)

Or, n^2/log n <= n^(2 - ε)

Or, n^2/log n <= n^2/n^0.1, where we choose ε = 0.1

=> n^2 n^0.1 <= n^2 log n

To satisfy this relation, log n must be at least n^0.1.
But n^0.1 is a polynomial in n, so n^0.1 is greater than log n for all sufficiently large n,
i.e. n^0.1 > log n

Thus the master method fails in this case (the same argument applies to any ε > 0).


Test case 2: f(n) = Ω(n^(log_b a + ε))

=> f(n) >= n^(log_b a + ε)

Or, n^2/log n >= n^(2 + ε)

Or, n^2/log n >= n^(2 + 0.1)

Or, n^2 >= n^2 n^0.1 log n, which is false

Test case 3: f(n) = Θ(n^(log_b a))

=> f(n) = n^(log_b a)

Or, n^2/log n = n^2

Or, n^2 = n^2 log n, which is false

Since none of the three cases holds, the master method cannot be applied to this
recurrence relation.


Example 1: Solve the following recurrence relation by using Master's method
T(n) = 3T(n/2) + n

Soln: Here we have a = 3, b = 2 and f(n) = n

Now, n^(log_b a) = n^(log_2 3) = n^(log 3/log 2) = n^1.585
Also f(n) = n^1

Since f(n) = O(n^(log_b a - ε)), where we choose ε = 0.1,
it satisfies the first case of Master's method.
Thus its complexity is

T(n) = Θ(n^(log_b a)) = Θ(n^(log_2 3)) = Θ(n^1.58)

Thus T(n) = Θ(n^1.58)

Example 2: Solve the following recurrence relation by using Master's method
T(n) = 4T(n/2) + n^2

Soln: Here we have a = 4, b = 2 and f(n) = n^2

Now, n^(log_b a) = n^(log_2 4) = n^(2 log_2 2) = n^2

Also f(n) = n^2

Since f(n) = Θ(n^(log_b a)),
it satisfies the third case of Master's method.
Thus its complexity is

T(n) = Θ(f(n) log n) = Θ(n^2 log n)

Thus T(n) = Θ(n^2 log n)

Example 3: Solve the following recurrence relation by using Master's method
T(n) = 9T(n/3) + n

Soln: Here we have a = 9, b = 3 and f(n) = n

Now, n^(log_b a) = n^(log_3 9) = n^(2 log_3 3) = n^2

Also f(n) = n

Since f(n) = O(n^(log_b a - ε)), where we choose ε = 0.1,
it satisfies the first case of Master's method.
Thus its complexity is

T(n) = Θ(n^(log_b a)) = Θ(n^(log_3 9)) = Θ(n^2)

Thus T(n) = Θ(n^2)

Example 4: Solve the following recurrence relation by using Master's method
T(n) = 3T(n/4) + n log n
Soln: Here we have a = 3, b = 4 and f(n) = n log n
Now, n^(log_b a) = n^(log_4 3) = n^0.793
Also f(n) = n log n

Since f(n) = Ω(n^(log_b a + ε)), where we choose ε = 0.1, and
a*f(n/b) = 3(n/4) log(n/4) <= (3/4) n log n,
it satisfies the second case of Master's method.
Thus its complexity is

T(n) = Θ(f(n)) = Θ(n log n)

Thus T(n) = Θ(n log n)

Example 5: Solve the following recurrence relation by using Master's method
T(n) = 2T(n/4) + sqrt(n)

Soln: Here we have a = 2, b = 4 and f(n) = sqrt(n) = n^(1/2) = n^0.5
Now, n^(log_b a) = n^(log_4 2) = n^0.5
Also f(n) = n^0.5
Since f(n) = Θ(n^(log_b a)),
it satisfies the third case of Master's method.
Thus its complexity is

T(n) = Θ(f(n) log n) = Θ(n^0.5 log n)

Thus T(n) = Θ(n^0.5 log n)

Example 6: Solve the following recurrence relation by using Master's method
T(n) = 2T(2n/3) + 1

Soln: First convert this relation into Master's form as,

T(n) = 2T(n/(3/2)) + n^0

=> T(n) = 2T(n/1.5) + n^0

Solve it yourself.


We get the solution T(n) = Θ(n^(log_1.5 2)) = Θ(n^1.71)
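
For the common situation where f(n) is a plain power of n, the comparison between f(n) and n^(log_b a) can be automated. The helper below is a sketch under that assumption only: it handles f(n) = n^k and nothing else, so a recurrence such as the Case 4 example with f(n) = n^2/log n is outside its scope. The function name master_polynomial and its output format are made up for this illustration.

```python
import math


def master_polynomial(a: float, b: float, k: float) -> str:
    """Classify T(n) = a*T(n/b) + n^k; assumes a >= 1, b > 1 and k >= 0."""
    crit = math.log(a) / math.log(b)       # the critical exponent log_b a
    if math.isclose(k, crit, rel_tol=1e-9, abs_tol=1e-12):
        return f"Theta(n^{k:g} log n)"     # f(n) = Theta(n^(log_b a))
    if k < crit:
        return f"Theta(n^{crit:.3f})"      # f(n) is polynomially smaller
    # For f(n) = n^k with k > log_b a, a*f(n/b) = (a / b^k) * f(n) and
    # a / b^k < 1, so the regularity condition holds automatically.
    return f"Theta(n^{k:g})"


print(master_polynomial(3, 2, 1))     # T(n) = 3T(n/2) + n
print(master_polynomial(4, 2, 2))     # T(n) = 4T(n/2) + n^2
print(master_polynomial(2, 1.5, 0))   # T(n) = 2T(2n/3) + 1
```

The floating-point comparison uses math.isclose because log_b a is computed as a ratio of logarithms and may differ from the exact value by rounding error.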


Exercises 

> Show that the solution of T(n) = 2T(⌊n/2⌋) + n is Ω(n log n). Conclude that the solution is Θ(n log n).

> Show that the solution to T(n) = 2T(⌊n/2⌋ + 17) + n is O(n log n).

> Write the recursive Fibonacci number algorithm, derive a recurrence relation for it, and solve it by
the substitution method. {Guess 2^n}

> Argue that the solution to the recurrence T(n) = T(n/3) + T(2n/3) + n is Ω(n log n) by
appealing to a recursion tree.

> Use iteration to solve the recurrence T(n) = T(n-a) + T(a) + n, where a >= 1 is a constant.

> The running time of an algorithm A is described by the recurrence T(n) = 7T(n/2) + n^2. A
competing algorithm A' has a running time of T'(n) = aT'(n/4) + n^2. What is the largest
integer value for 'a' such that A' is asymptotically faster than A?




CHAPTER 2 


Review of Data Structures 


This part introduces some of the data structures; for a rigorous study you can consult a
book on data structures.


Simple Data structures

The basic structures that represent unit values are bits, integers, floating point numbers, etc. A
collection of values of basic types can be represented by arrays, structures, etc. Access to the
values takes constant time for these kinds of data structures.


Linear Data Structures

A data structure is called linear if every item is related to its next and previous items. In other
words, a data structure in which items are arranged in sequence is called a linear data
structure. Examples of linear data structures are: array, linked list, stack, queue, etc. Linear data
structures are widely used; we quickly go through the following ones.


Lists

The list is the simplest general-purpose data structure. Lists come in different varieties. The most
fundamental representation of a list is an array. The other representation is the linked list, which
itself has several varieties: singly linked, doubly linked, circular, etc. A pointer is used to point
to the first element. To traverse the list, each element points to the next element (and also to the
previous one in a doubly linked list). Lists require linear space to store their elements, i.e. space
proportional to the number of items. For example, storing n items in an array requires nd space,
where d is the size of one data item. A singly linked list takes n(d + p), where p is the size of a
pointer. Similarly, a doubly linked list requires n(d + 2p) space.

Array representation

• Operations require simple implementations.

• Insert, delete, and search require linear time; search can take O(log n) if
  binary search is used. To use binary search the array must be sorted.

• Inefficient use of space

Singly linked representation (unordered)

1. Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise O(n) time.

2. Search and traversal can be done in O(n) time.

3. Memory overhead, but memory is allocated only to entries that are present.

Doubly linked representation

1. Insert and delete can be done in O(1) time if the pointer to the node is given, otherwise O(n) time.

2. Search and traversal can be done in O(n) time.

3. Memory overhead, but memory is allocated only to entries that are present; searching becomes easier.

Operations on lists

• boolean isEmpty ();

Return true if and only if this list is empty.

• int size ();

Return this list's length.

• int get (int i);

Return the element with index i in this list.

• boolean equals (List a, List b);

Return true if and only if the two lists have the same length and their corresponding elements are equal.

• void clear ();

Make this list empty.

• void set (int i, int elem);

Replace by elem the element at index i in this list.

• void add (int i, int elem);

Add elem as the element with index i in this list.

• void add (int elem);

Add elem after the last element of this list.

• void addAll (List a, List b);

Add all the elements of list b after the last element of list a.

• int remove (int i);

Remove and return the element with index i in this list.

• void visit (List a);

Print all elements of the list.


Operation        Array representation    SLL representation

get              O(1)                    O(n)
set              O(1)                    O(n)
add(int,data)    O(n)                    O(n)
add(data)        O(1)                    O(1)
remove           O(n)                    O(n)
equals           O(n_2)                  O(n_2)
addAll           O(n_2)                  O(n_2)

(Here n_2 is taken to be the length of the second list.)


Stacks and Queues 

These types of data structures are special cases of lists. Stack also called LIFO (Last In First Out) 
list. In this structure items can be added or removed from only one end. Stacks are generally 
represented either in array or in singly linked list and in both cases insertion/deletion time is 0(1), 
but search time is 0(n). 

Operations on stacks 

> boolean isEmpty (); 

Return true if and only if this stack is empty. Complexity is 0(1). 

> int getLast (); 

Return the element at the top of this stack. Complexity is 0(1). 

> void clear (); 

Make this stack empty. Complexity is 0(1). 

> void push (int elem); 

Add elem as the top element of this stack. Complexity is 0(1). 

> int pop 0; 

Remove and return the element at the top of this stack. Complexity is 0(1). 

Queues are like stacks, but they implement the FIFO (First In First Out) policy. One end is used for 
insertion and the other for deletion. They are mostly represented circularly in an array, which gives O(1) 
insertion/deletion time. A circular singly linked representation takes O(1) insertion time and O(1) 
deletion time. Representing queues in a doubly linked list also gives O(1) insertion and deletion time. 

Operations on queues 

1. boolean isEmpty (); 

Return true if and only if this queue is empty. Complexity is O(1). 

2. int size (); 

Return this queue’s length. Complexity is O(n). 

3. int getFirst (); 

Return the element at the front of this queue. Complexity is O(1). 

4. void clear (); 

Make this queue empty. Complexity is O(1). 

5. void insert (int elem); 

Add elem as the rear element of this queue. Complexity is O(1). 

6. int delete (); 

Remove and return the front element of this queue. Complexity is O(1). 
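
The circular array representation mentioned above can be sketched as follows. This is a minimal illustration, not the notes' own implementation: the class name CircularQueue, the fixed capacity and the Python method names are assumptions made for the example. It also keeps an element counter, so size() here is O(1) rather than the O(n) listed above.

```python
class CircularQueue:
    """Fixed-capacity FIFO queue stored circularly in a list (illustrative sketch)."""

    def __init__(self, capacity):
        self._items = [None] * capacity
        self._front = 0   # index of the front element
        self._size = 0    # number of stored elements

    def is_empty(self):
        return self._size == 0

    def size(self):
        return self._size

    def get_first(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        return self._items[self._front]

    def insert(self, elem):
        """Add elem at the rear in O(1) time."""
        if self._size == len(self._items):
            raise OverflowError("queue is full")
        rear = (self._front + self._size) % len(self._items)
        self._items[rear] = elem
        self._size += 1

    def delete(self):
        """Remove and return the front element in O(1) time."""
        elem = self.get_first()
        self._items[self._front] = None
        self._front = (self._front + 1) % len(self._items)
        self._size -= 1
        return elem
```

The modulo arithmetic is what makes the array circular: after a deletion frees the first slot, the next insertion wraps around and reuses it, so no elements ever need to be shifted.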


Non-linear data structure: 

A data structure is said to be a non-linear data structure if an item can be attached to many other 
items in specific ways. In other words, a data structure whose items are not arranged in a single 
sequence is called a non-linear data structure. Example: Tree, Graph etc. 


Tree Data Structures 

A tree is a collection of nodes. If the collection is empty the tree is empty; otherwise it contains a 
distinct node called the root (r) and zero or more sub-trees whose roots are directly connected to the 
node r by edges. The root of each sub-tree is called a child of r, and r is its parent. Any node without a 
child is called a leaf. We can also describe a tree as a connected graph without a cycle, so there is a 
path from any node to any other node in the tree. The main attraction of this data structure is that 
most operations run in O(log n) time when the tree is balanced. We can represent a tree as an array or 
a linked list. 

Some of the definitions 

• Level h of a full tree of degree d has $d^{h-1}$ nodes. 

• The first h levels of a full tree have 

$$1 + d + d^2 + \dots + d^{h-1} = \frac{d^h - 1}{d - 1}$$ nodes. 

Binary Search Trees 

A BST has at most two children for each parent. In a BST the key at each vertex must be greater than all 
the keys held by its left descendants and smaller than or equal to all the keys held by its right 
descendants. Searching and insertion both take O(h) worst-case time, where h is the height of the tree. 
The relation between the height and the number of nodes n is given by log n < h+1 <= n; e.g. the height of 
a binary tree with 16 nodes may be anywhere between 4 and 15. 

When is the height 4 and when is it 15? The height is 4 when the levels are filled as evenly as possible, 
and 15 when every node has only one child, so that the tree degenerates into a chain. 

So if we are sure that the tree is height balanced then we can say that search and insertion have 
O(log n) run time; otherwise we have to be content with O(n). 

Operation    Algorithm        Time complexity
Search       BST search       O(log n) best, O(n) worst
Add          BST insertion    O(log n) best, O(n) worst
Remove       BST deletion     O(log n) best, O(n) worst
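
The effect of insertion order on the height can be checked with a small sketch. The names Node, insert, search, height, build and balanced_order are illustrative choices, not taken from the notes, and height is counted in edges (an empty tree has height -1), which is the convention under which 16 nodes give a height between 4 and 15.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key iteratively; keys >= node.key go to the right sub-tree."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right


def search(root, key):
    """Return True if key is stored in the tree; takes O(h) steps."""
    cur = root
    while cur is not None and cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur is not None


def height(root):
    """Height in edges; the empty tree has height -1."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def balanced_order(lo, hi):
    """List lo..hi so that the middle key of every range comes first."""
    if lo > hi:
        return []
    mid = (lo + hi) // 2
    return [mid] + balanced_order(lo, mid - 1) + balanced_order(mid + 1, hi)
```

Inserting the keys 1..16 in increasing order produces a chain, while inserting them in the order given by balanced_order(1, 16) keeps the tree as shallow as possible.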


AVL Trees 

A balanced tree named after Adelson-Velskii and Landis. AVL trees are binary search trees in 
which the heights of the two sub-trees of each node differ by at most 1. Due to insertion and deletion 
the tree may become unbalanced, so rebalancing must be done by using left rotation, right rotation or 
double rotation. 


Operation    Algorithm        Time complexity
Search       AVL search       O(log n) best, worst
Add          AVL insertion    O(log n) best, worst
Remove       AVL deletion     O(log n) best, worst


Priority Queues 

A priority queue is a queue in which the elements are prioritized. The least element in the priority 
queue is always removed first. Priority queues are used in many computing applications. For 
example, many operating systems use a scheduling algorithm where the next process executed is 
the one with the shortest execution time or the highest priority. Priority queues can be 
implemented by using arrays, linked lists or a special kind of tree (i.e. a heap). 

• boolean isEmpty (); 

Return true if and only if this priority queue is empty. 



• int size (); 

Return the length of this priority queue. 

• int getLeast (); 

Return the least element of this priority queue. If there are several least elements, return 
any of them. 

• void clear (); 

Make this priority queue empty. 

• void add (int elem); 

Add elem to this priority queue. 

• int delete (); 

Remove and return the least element from this priority queue. (If there are 
several least elements, remove the same element that would be returned by getLeast.) 


Operation      Sorted (linked list)    Unsorted (linked list)    Sorted array    Unsorted array
add            O(n)                    O(1)                      O(n)            O(1)
removeLeast    O(1)                    O(n)                      O(1)            O(n)
getLeast       O(1)                    O(n)                      O(1)            O(n)


Heap 

A heap is a complete tree with an ordering relation R holding between each node and its 
descendants. Note that a complete tree here means that the tree can miss only the rightmost part of the bottom 
level. R can be smaller-than or bigger-than. 

E.g. Heap with degree 2 and R is “bigger than”. 




Heap Sort: Build a heap from the given set in O(n) time, then repeatedly remove the elements from 
the heap in O(n log n) total time. 

Implementation 

Heaps are implemented by using arrays. Insertion and deletion of an element take O(log n) time. 


Operation    Algorithm              Time complexity
add          insertion              O(log n)
delete       deletion               O(log n)
getLeast     access root element    O(1)
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DIVIDE AND CONQUER ALGORITHMS 
(Sorting Searching and Selection) 



Sorting 

Sorting is among the most basic problems in algorithm design. We are given a sequence of items, 
each associated with a given key value. The problem is to permute the items so that they are in 
increasing (or decreasing) order by key. Sorting is important because it is often the first step in 
more complex algorithms. Sorting algorithms are usually divided into two classes, internal sorting 
algorithms, which assume that data is stored in an array in main memory, and external sorting 
algorithm, which assume that data is stored on disk or some other device that is best accessed 
sequentially. We will only consider internal sorting. Sorting algorithms often have additional 
properties that are of interest, depending on the application. Here are two important properties. 

In-place: The algorithm uses no additional array storage, and hence (other than perhaps the 
system’s recursion stack) it is possible to sort very large lists without the need to allocate 
additional working storage. 

Stable: A sorting algorithm is stable if two elements that are equal remain in the same relative 
position after sorting is completed. This is of interest, since in some sorting applications you sort 
first on one key and then on another. It is nice to know that two items that are equal on the second 
key remain sorted on the first key. 

Merge Sort 

This sorting algorithm is based on the divide and conquer strategy. 

To sort an array A[l...r]: 

• Divide 

- Divide the n-element sequence to be sorted into two subsequences of size n/2 each. 

• Conquer 

- Sort the subsequences recursively using merge sort. When the size of a sequence is 1 
there is nothing more to do. 

• Combine 

- Merge the two sorted subsequences. 


Tracing: 

A[ ] = {4, 7, 2, 6, 1, 4, 7, 3, 5, 2, 6} 

Solution: 

[Figure: dividing and merging steps for the array A] 

Algorithm: 

MergeSort(A, l, r) 
{ 
    if (l < r) 
    { 
        m = ⌊(l + r)/2⌋;          //Divide 
        MergeSort(A, l, m);       //Conquer 
        MergeSort(A, m + 1, r);   //Conquer 
        Merge(A, l, m + 1, r);    //Combine 
    } 
} 

Merge(A, l, m, r)   //merges A[l..m-1] and A[m..r] using a temporary array B 
{ 
    x = l; y = m; 
    k = l; 
    while (x < m && y <= r) 
    { 
        if (A[x] <= A[y]) 
        { 
            B[k] = A[x]; 
            k++; x++; 
        } 
        else 
        { 
            B[k] = A[y]; 
            k++; y++; 
        } 
    } 
    while (x < m)        //copy the rest of the first half 
    { 
        B[k] = A[x]; 
        k++; x++; 
    } 
    while (y <= r)       //copy the rest of the second half 
    { 
        B[k] = A[y]; 
        k++; y++; 
    } 
    for (i = l; i <= r; i++) 
    { 
        A[i] = B[i]; 
    } 
} 


Time Complexity: 

Recurrence Relation for Merge sort: 

T(n) = 1 if n = 1 

T(n) = 2T(n/2) + O(n) if n > 1 

Solving this recurrence we get, 

Time Complexity = T(n) = O(n log n) 

Space Complexity: 

It uses one extra array and some extra variables during sorting, therefore 
Space Complexity = 2n + c = O(n) 
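
A runnable sketch of the same idea is given below. It is a simplified variant, not the pseudocode above: it returns a new sorted list instead of sorting in place through the auxiliary array B, and the function names merge_sort and merge are choices made for this example.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal keys in their original order
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])          # at most one of these two is non-empty
    out.extend(right[j:])
    return out


def merge_sort(a):
    """Return a new sorted list; the input list is left unchanged."""
    if len(a) <= 1:               # a sequence of size 0 or 1 is already sorted
        return list(a)
    mid = len(a) // 2             # divide
    left = merge_sort(a[:mid])    # conquer
    right = merge_sort(a[mid:])   # conquer
    return merge(left, right)     # combine
```

Applied to the tracing array, merge_sort([4, 7, 2, 6, 1, 4, 7, 3, 5, 2, 6]) gives [1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7]. Taking from the left list on ties is what makes the sort stable.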


Quick Sort 

• Divide 

Partition the array A[l...r] around a pivot into 2 sub-arrays A[l...p-1] and A[p+1...r], such that each 
element of A[l...p-1] is smaller than or equal to A[p] and each element of A[p+1...r] is greater than A[p]. 
We need to find the index p to partition the array. 

• Conquer 

Recursively sort A[l...p-1] and A[p+1...r] using Quick sort. 

• Combine 

Trivial: the arrays are sorted in place. No additional work is required to combine them. 


[Figure: an array of n elements split into Partition 1 (values <= key), the key, and Partition 2 (values > key)] 

[Figure: step-by-step trace of Quick Sort with scanning pointers x and y and pivot p; x and y are swapped while x < y, and once x and y cross, y is swapped with the pivot. The trace ends with the sorted sequence 1 2 3 3 4 5 6 7] 

Algorithm: 

QuickSort(A, l, r) 
{ 
    if (l < r) 
    { 
        p = Partition(A, l, r); 
        QuickSort(A, l, p-1); 
        QuickSort(A, p+1, r); 
    } 
} 

Partition(A, l, r) 
{ 
    x = l; y = r + 1; p = A[l];      //pivot is the first element 
    while (x < y) 
    { 
        do { 
            x++; 
        } while (x <= r && A[x] <= p); 
        do { 
            y--; 
        } while (A[y] > p); 
        if (x < y) 
            swap(A[x], A[y]); 
    } 
    A[l] = A[y]; A[y] = p; 
    return y;   //return position of pivot 
} 
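
The partition scheme can be tried out with the following sketch. It assumes 0-based indices, starts the right pointer one position past r so that the last element is also examined, and guards the left scan against running off the end; these details and the function names are assumptions of the example, not a transcription of the notes.

```python
def partition(a, l, r):
    """Partition a[l..r] around the pivot a[l]; return the pivot's final index."""
    p = a[l]
    x, y = l, r + 1
    while x < y:
        x += 1
        while x <= r and a[x] <= p:   # scan right for an element > pivot
            x += 1
        y -= 1
        while a[y] > p:               # scan left for an element <= pivot
            y -= 1                    # stops at a[l] == p at the latest
        if x < y:
            a[x], a[y] = a[y], a[x]
    a[l], a[y] = a[y], p              # x and y have crossed: place the pivot
    return y


def quick_sort(a, l=0, r=None):
    """Sort the list a in place."""
    if r is None:
        r = len(a) - 1
    if l < r:
        p = partition(a, l, r)
        quick_sort(a, l, p - 1)
        quick_sort(a, p + 1, r)
```

For the array 5 3 2 6 4 1 3 7 the first call to partition swaps 6 with the second 3, then places the pivot 5 at index 5, leaving (1 3 2 3 4) 5 (6 7).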

Time Complexity: 

The complexity of partitioning is O(n), because the pointers x and y together sweep across the sub-array 
only once, so the loops execute at most cn steps in total. 
Thus the recurrence relation for quick sort is: 

T(n) = T(k) + T(n-k-1) + O(n) 

where k is the number of elements in the first partition. 

Best Case: 

The pivot divides the array into two partitions of equal size, therefore 
T(n) = 2T(n/2) + O(n). Solving this recurrence we get, 

=> Time Complexity = O(n log n) 

[Figure: recursion tree for the best case; each of the lg n levels costs n, giving Θ(n lg n) in total] 

Worst case: 

When the array is already sorted or sorted in reverse order, one partition contains n-1 items and 
the other contains zero items, therefore 
T(n) = T(n-1) + O(n). Solving this recurrence we get 
=> Time Complexity = O(n^2) 

Case between worst and best (Average case): 

Even a fairly unbalanced split gives the same bound as the best case. Suppose the elements are always 
divided in the ratio 9:1. 

Then the recurrence relation for this case is, 

T(n) = T(9n/10) + T(n/10) + O(n) 

By solving this recurrence we get 

Time Complexity = T(n) = O(n log n) 

[Figure: recursion tree for the 9:1 split; the shortest branch has depth log_10 n, the longest has depth log_{10/9} n, and each level costs at most n, giving O(n lg n) in total] 


Randomized Quick Sort: 

An algorithm is called randomized if its behavior depends on the input as well as on random values 
produced by a random-number generator. The beauty of a randomized algorithm is that no 
particular input can reliably produce its worst-case behavior. 

IDEA: Partition around a random element. The expected running time is then independent of the input 
order, and no assumptions need to be made about the input distribution. No specific input elicits the 
worst-case behavior; the worst case is determined only by the output of the random-number generator. 
Randomization cannot eliminate the worst case, but it can make it less likely! 

Algorithm: 

RandQuickSort(A, l, r) 
{ 
    if (l < r) 
    { 
        m = RandPartition(A, l, r); 
        RandQuickSort(A, l, m-1); 
        RandQuickSort(A, m+1, r); 
    } 
} 

RandPartition(A, l, r) 
{ 
    k = random(l, r);   //generates random number between l and r including both 
    swap(A[l], A[k]); 
    return Partition(A, l, r); 
} 

Partition(A, l, r) 
{ 
    x = l; y = r + 1; p = A[l]; 
    while (x < y) 
    { 
        do { 
            x++; 
        } while (x <= r && A[x] <= p); 
        do { 
            y--; 
        } while (A[y] > p); 
        if (x < y) 
            swap(A[x], A[y]); 
    } 
    A[l] = A[y]; A[y] = p; 
    return y;   //return position of pivot 
} 

Time Complexity: 

Worst Case: 

T(n) = worst-case running time 

$$T(n) = \max_{1 \le k \le n-1} \bigl(T(k) + T(n-k)\bigr) + O(n) \qquad (1)$$

where k is the partition point produced by the random-number generator. 

Now we use the substitution method to show that the running time of Quick sort is $O(n^2)$. 
Guess $T(n) = O(n^2)$ 

$$\Rightarrow T(n) \le cn^2 \qquad (2)$$

Now prove this by mathematical induction. 

Basic step: for n = 1, 

$T(1) \le c \cdot 1^2$ 

or $1 \le c$, which is true for $c \ge 1$. 

Inductive step: 

Let us assume that it is true for all k < n, 
i.e. $T(k) \le ck^2$ for any k < n. 
Then it is also true for n-k, 
i.e. $T(n-k) \le c(n-k)^2$. 

Now equation (1) becomes, 

$$T(n) \le \max_{1 \le k \le n-1} \bigl(ck^2 + c(n-k)^2\bigr) + O(n) = c \cdot \max_{1 \le k \le n-1} \bigl(k^2 + (n-k)^2\bigr) + O(n)$$

The expression $k^2 + (n-k)^2$ achieves its maximum over the range $1 \le k \le n-1$ at one of the endpoints: 

$$\max_{1 \le k \le n-1} \bigl(k^2 + (n-k)^2\bigr) = 1^2 + (n-1)^2 = n^2 - 2(n-1)$$

$$T(n) \le cn^2 - 2c(n-1) + O(n) \le cn^2$$

provided c is chosen large enough that $2c(n-1)$ dominates the O(n) term. 

$$\Rightarrow T(n) = O(n^2)$$


Average Case: 

To analyze the average case, assume for simplicity that all the input elements are distinct. If duplicate 
elements are allowed the complexity bound is the same, but the analysis is more intricate. 
Assume that each of the n elements is equally likely to be chosen as the pivot, i.e. with probability 1/n. 

The recurrence relation for the algorithm is then 

$$T(n) = \frac{1}{n}\sum_{k=1}^{n-1}\bigl(T(k) + T(n-k)\bigr) + O(n)$$

For k = 1, 2, ..., n-1, each term T(k) appears twice in the sum (once as T(k) and once as T(n-k)), so 

$$T(n) = \frac{2}{n}\sum_{k=1}^{n-1} T(k) + O(n)$$

$$nT(n) = 2\sum_{k=1}^{n-1} T(k) + O(n^2) \qquad (1)$$

Similarly, 

$$(n-1)T(n-1) = 2\sum_{k=1}^{n-2} T(k) + O((n-1)^2) \qquad (2)$$

Subtracting equation (2) from equation (1), and writing the two O-terms as $n^2$ and $(n-1)^2$, we get 

$$nT(n) - (n-1)T(n-1) = 2\sum_{k=1}^{n-1} T(k) - 2\sum_{k=1}^{n-2} T(k) + n^2 - (n-1)^2$$

$$= 2T(n-1) + n^2 - n^2 + 2n - 1 = 2T(n-1) + 2n - 1$$

or, $nT(n) = (n-1)T(n-1) + 2T(n-1) + (2n-1) = (n+1)T(n-1) + (2n-1)$ 

or, $nT(n) - (n+1)T(n-1) = 2n - 1$ 

Dividing both sides by n(n+1) we get 

$$\frac{T(n)}{n+1} = \frac{T(n-1)}{n} + \frac{2n-1}{n(n+1)}$$

Let $A_n = T(n)/(n+1)$. Then 

$$A_n = A_{n-1} + \frac{2n-1}{n(n+1)}$$

Unrolling this recurrence (just as $s_n = s_{n-1} + n$ unrolls to the sum of the first n natural numbers) gives 

$$A_n = \sum_{i=1}^{n} \frac{2i-1}{i(i+1)} \approx \sum_{i=1}^{n} \frac{2i}{i(i+1)} = 2\sum_{i=1}^{n} \frac{1}{i+1}$$

This is a harmonic series, 

hence $A_n \approx 2\log n \qquad (3)$ 

Since $A_n = T(n)/(n+1)$, 

$2\log n = T(n)/(n+1)$ 

=> $T(n) = 2(n+1)\log n = 2n\log n + 2\log n$ 

=> $T(n) = O(n \log n)$ 


Heap Sort 

A heap is an almost complete binary tree of n nodes such that the value of each node is less than or 
equal to the value in parent node. This type of heap is called max heap. By default the heap is max 
heap. 



Fig Heap of given array of elements 


Array Representation of Heaps 

A heap can be stored as an array A. 

- Root of tree is A[1] 

- Left child of A[i] = A[2i] 

- Right child of A[i] = A[2i + 1] 

- Parent of A[i] = A[⌊i/2⌋] 

- Heapsize[A] <= length[A] 

The elements in the sub-array A[(⌊n/2⌋+1) .. n] are leaves. 



Max-heaps (largest element at root) have the max-heap property: 

- for all nodes i, excluding the root: 

A[PARENT(i)] >= A[i] 

Min-heaps (smallest element at root) have the min-heap property: 

- for all nodes i, excluding the root: 

A[PARENT(i)] <= A[i] 


Adding/Deleting Nodes 

New nodes are always inserted at the bottom level (left to right) and nodes are removed from the 
bottom level (right to left). 



Operations on Heaps 

> Maintain/Restore the max-heap property 

- MAX-HEAPIFY 

> Create a max-heap from an unordered array 

- BUILD-MAX-HEAP 

> Sort an array in place 

- HEAPSORT 
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Heapify Property 

If any node violates the heap property, swap this node with the larger of its children to restore the 
heap property; this operation is called heapify. 

MAX-HEAPIFY operation: 

• Find location of largest value of: 

A[ i ], A[ Left( i)] and A[ Right( i) ] 

• If not A[ i ], max-heap property does not hold. 

• Exchange A[ i ] with the larger of the two children to preserve max-heap property. 

• Continue this process of compare/exchange down the heap until sub-tree rooted at i is a 
max-heap. 

• At a leaf, the sub-tree rooted at the leaf is trivially a max-heap. 

Example: 

[Figure: example of the MAX-HEAPIFY operation] 



Algorithm: 

Max-Heapify(A, i, n) 
{ 
    l = Left(i) 
    r = Right(i) 
    largest = i; 
    if l <= n and A[l] > A[largest] 
        largest = l 
    if r <= n and A[r] > A[largest] 
        largest = r 
    if largest ≠ i 
        exchange(A[i], A[largest]) 
        Max-Heapify(A, largest, n) 
} 
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Analysis: 

In the worst case Max-Heapify is called recursively h times, where h is the height of the heap, 
and each call does a constant amount of work apart from the recursive call. Therefore 
Time complexity = O(h) = O(log n) 


Building a Heap 

The process of converting a given binary tree into a heap by performing the heapify operation on each of 
the non-leaf nodes of the tree is known as the build heap operation. 

Convert an array A[1 ... n] into a max-heap (n = length[A]). Apply MAX-HEAPIFY on the elements 
at positions ⌊n/2⌋ down to 1. 

Example: 
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Algorithm: 

Build-Max-Heap(A) 
{ 
    n = length[A] 
    for (i = ⌊n/2⌋; i >= 1; i--) 
    { 
        Max-Heapify(A, i, n); 
    } 
} 


Time Complexity: 

Running time: the loop executes O(n) times and the complexity of Heapify is O(log n), therefore 
the complexity of Build-Max-Heap is O(n log n). 

This is not an asymptotically tight upper bound. 
Heapify takes O(h) time. 

=> The cost of Heapify on a node i is proportional to the height of the node i in the tree. 

$$\Rightarrow T(n) = \sum_{i=0}^{h} n_i h_i$$

where $h_i = h - i$ is the height of the heap rooted at level i, and 
$n_i = 2^i$ is the number of nodes at level i. 

$$\Rightarrow T(n) = \sum_{i=0}^{h} 2^i (h-i) = \sum_{i=0}^{h} 2^h \frac{h-i}{2^{h-i}}$$

Let k = h - i. 

$$\Rightarrow T(n) = 2^h \sum_{k=0}^{h} \frac{k}{2^k} \le 2^h \sum_{k=0}^{\infty} \frac{k}{2^k} \qquad (1)$$

We know that $\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}$ for |x| < 1. 

Differentiating both sides we get, 

$$\sum_{k=0}^{\infty} k x^{k-1} = \frac{1}{(1-x)^2}$$

and multiplying both sides by x, 

$$\sum_{k=0}^{\infty} k x^{k} = \frac{x}{(1-x)^2}$$

Put x = 1/2: 

$$\sum_{k=0}^{\infty} \frac{k}{2^k} = \frac{1/2}{(1-1/2)^2} = 2$$

Now equation (1) becomes, 

$$T(n) \le 2^h \sum_{k=0}^{\infty} \frac{k}{2^k} = 2^h \cdot 2 \le 2^{\log n} \cdot 2 = 2n$$

$$\Rightarrow T(n) = O(n)$$


Heapsort 

• Build a max-heap from the array 

• Swap the root (the maximum element) with the last element in the array 

• “Discard” this last node by decreasing the heap size 

• Perform Max-Heapify operation on the new root node 

• Repeat this process until only one node remains 

Example: A[ ] = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7} 

Solution: First construct a binary tree from the given array, then construct a heap from that tree. 

[Figure: the binary tree for A, the max-heap built from it, and the repeated swap and heapify steps of heap sort] 

Sorted result: 1 2 3 4 7 8 9 10 14 16 

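Algorithm: 

HeapSort(A) 
{ 
    Build-Max-Heap(A);   //into max heap 
    n = length[A]; 
    for (i = n; i >= 2; i--) 
    { 
        swap(A[1], A[i]);        //move the current maximum to its final place 
        n = n - 1;               //discard the last node from the heap 
        Max-Heapify(A, 1, n); 
    } 
} 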
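
The three heap operations can be put together in a short runnable sketch. It assumes 0-based Python lists, so the children of index i are 2i + 1 and 2i + 2 rather than 2i and 2i + 1, and it uses an iterative sift-down in place of the recursive call; the function names are illustrative.

```python
def max_heapify(a, i, n):
    """Sift a[i] down until the sub-tree rooted at i is a max-heap (heap is a[0:n])."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Heapify every non-leaf node, from the last one up to the root."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(a, i, n)


def heap_sort(a):
    """Sort the list a in place in increasing order."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to its final place
        max_heapify(a, 0, end)        # restore the heap on the remaining prefix
```

For the example array {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}, build_max_heap rearranges it into 16 14 10 8 7 9 3 2 4 1, and heap_sort then yields 1 2 3 4 7 8 9 10 14 16.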

Analysis: 

Build heap takes O(n) time. 

The for loop executes at most O(n) times. 

Within the for loop the heapify operation takes at most O(log n) time. 
Thus total time complexity T(n) = O(n) + O(n) · O(log n) 

=> T(n) = O(n log n) 

Sort              Worst Case     Average Case    Best Case      Comments
Insertion Sort    Θ(n^2)         Θ(n^2)          Θ(n)
Selection Sort    Θ(n^2)         Θ(n^2)          Θ(n^2)         * Unstable
Bubble Sort       Θ(n^2)         Θ(n^2)          Θ(n^2)
Merge Sort        Θ(n log n)     Θ(n log n)      Θ(n log n)     * Requires memory
Heap Sort         Θ(n log n)     Θ(n log n)      Θ(n log n)     * Large constants
Quick Sort        Θ(n^2)         Θ(n log n)      Θ(n log n)     * Small constants


Searching 

Searching is to look for something in a list or an array. 

Sequential Search 

Simply search for the given element left to right and return the index of the element, if found. 
Otherwise return “Not Found”. 

Algorithm: 

LinearSearch(A, n, key) 
{ 
    for (i = 0; i < n; i++) 
    { 
        if (A[i] == key) 
            return i; 
    } 
    return -1;   //-1 indicates unsuccessful search 
} 

Analysis: 

Time complexity T(n) = O(n) 

Binary Search: 

Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval 
covering the whole array. If the value of the search key is less than the item in the middle of the 
interval, narrow the interval to the lower half. Otherwise narrow it to the upper half. Repeatedly 
check until the value is found or the interval is empty. 


[Figure: searching in the first half of the array; searching in the second half of the array] 
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Steps: 

The algorithm is quite simple. It can be done either recursively or iteratively: 

1. Get the middle element. 

2. If the middle element equals the searched value, the algorithm stops. 

3. Otherwise, two cases are possible: 

o The searched value is less than the middle element. In this case, search for the item in the first 
half. 

o The searched value is greater than the middle element. In this case, search for the item in the 
second half. 

Continue this process until we find the desired element or the remaining interval is empty. 

Example 1: Find 6 in {-1, 5, 6, 18, 19, 25, 46, 78, 102, 114}. 

Step 1 (middle element is 19 > 6): -1 5 6 18 19 25 46 78 102 114 

Step 2 (middle element is 5 < 6): -1 5 6 18 

Step 3 (middle element is 6 == 6): 6 18 


Example 2: Find 103 in {-1, 5, 6, 18, 19, 25, 46, 78, 102, 114}. 

Step 1 (middle element is 19 < 103): -1 5 6 18 19 25 46 78 102 114 

Step 2 (middle element is 78 < 103): 25 46 78 102 114 

Step 3 (middle element is 102 < 103): 102 114 

Step 4 (middle element is 114 > 103): 114 

Step 5 (searched value is absent): the interval is empty 

Algorithm 

BinarySearch(A, l, r, key) 
{ 
    if (l > r) 
        return 0;            //interval is empty: unsuccessful search 
    m = (l + r) / 2;         //integer division 
    if (key == A[m]) 
        return m + 1;        //index starts from 0, so the position is m+1 
    else if (key < A[m]) 
        return BinarySearch(A, l, m-1, key); 
    else 
        return BinarySearch(A, m+1, r, key); 
} 

Analysis: 

From the above algorithm we can say that the running time of the algorithm is: 

T(n) = T(n/2) + O(1) 

= O(log n) 

In the best case the output is obtained in one step, i.e. O(1) time, if the key is at the middle. In the worst case 
the key is found only when the interval has shrunk to a single element, so the running time is O(log n). In the 
average case the running time is also O(log n). For an unsuccessful search the best, worst and average time 
complexity is O(log n). 
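
An iterative version of the search can be sketched as follows. It differs from the pseudocode above in its return convention: it returns the 0-based index of the key, or -1 when the key is absent. The function name is illustrative.

```python
def binary_search(a, key):
    """Return the index of key in the sorted list a, or -1 if key is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:               # the interval a[lo..hi] is still non-empty
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid
        if key < a[mid]:
            hi = mid - 1          # continue in the first half
        else:
            lo = mid + 1          # continue in the second half
    return -1
```

On the array of the two examples, searching for 6 returns index 2 and searching for 103 returns -1. Each pass halves the interval, which is where the O(log n) bound comes from.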


Selection 

The i-th order statistic of a set of elements is its i-th largest (or smallest) element. In general, let us take the 
i-th order statistic to be the i-th smallest element. Then the minimum is the first order statistic and the maximum 
is the last order statistic. Similarly, a median is the i-th order statistic where i = (n+1)/2 for odd n, and i = 
n/2 or n/2 + 1 for even n. This kind of problem is commonly called the selection problem. 

This problem can be solved in O(n log n) time in a very straightforward way: first sort the elements in 
O(n log n) time and then pick the i-th item from the array in constant time. What about a linear 
time algorithm for this problem? The following sections answer this. 

Nonlinear general selection algorithm 

We can construct a simple but inefficient general algorithm for finding the k-th smallest or k-th 
largest item in a list. It is efficient only when k is small. To accomplish this, we repeatedly find the most 
extreme remaining value and move it to the beginning, until we reach our desired index. 

Select(A, k, n) 
{ 
    for (i = 0; i < k; i++) 
    { 
        minindex = i; 
        minvalue = A[i]; 
        for (j = i+1; j < n; j++) 
        { 
            if (A[j] < minvalue) 
            { 
                minindex = j; 
                minvalue = A[j]; 
            } 
        } 
        swap(A[i], A[minindex]);   //after the inner loop has finished 
    } 
    return A[k-1];   //k-th smallest; index starts from 0 
} 

Analysis: 

When i = 0, inner loop executes n-1 times 
When i = 1, inner loop executes n-2 times 
When i = 2, inner loop executes n-3 times 
... 
When i = k-1, inner loop executes n-(k-1+1) = n-k times 

Thus, Time Complexity = (n-1) + (n-2) + ... + (n-k) 

In the worst case, if k = n, then 

T(n) = 0 + 1 + 2 + 3 + 4 + ... + (n-2) + (n-1) 

= n(n-1)/2 [since s_n = n(n+1)/2] 

= O(n^2) 
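
The same procedure as a runnable sketch, under two assumptions of this example: k counts from 1 (k = 1 asks for the minimum), and the function works on a copy so that the caller's list is not reordered.

```python
def select_kth_smallest(a, k):
    """Return the k-th smallest element of a (k = 1 is the minimum)."""
    items = list(a)                   # work on a copy
    n = len(items)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(a)")
    for i in range(k):                # k passes of selection sort
        min_index = i
        for j in range(i + 1, n):
            if items[j] < items[min_index]:
                min_index = j
        items[i], items[min_index] = items[min_index], items[i]
    return items[k - 1]
```

Only the first k positions are put in order, so the cost is (n-1) + (n-2) + ... + (n-k) comparisons, which is small when k is small and quadratic when k is close to n.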


Selection in expected linear time 

This problem is solved by using the “divide and conquer” method. The main idea is to partition the 
set of elements as in Quick Sort, with a randomized partition, and then to continue only in the 
partition that contains the required element. 

Algorithm: 

RandSelect(A, l, r, i) 
{ 
    if (l == r) 
        return A[l]; 
    p = RandPartition(A, l, r); 
    k = (p - l + 1);          //rank of the pivot within A[l..r] 
    if (i == k) 
        return A[p]; 
    else if (i < k) 
        return RandSelect(A, l, p-1, i); 
    else 
        return RandSelect(A, p+1, r, i - k); 
} 

RandPartition(A, l, r) 
{ 
    k = random(l, r);   //generates random number between l and r including both 
    swap(A[l], A[k]); 
    return Partition(A, l, r); 
} 

Partition(A, l, r) 
{ 
    x = l; y = r + 1; p = A[l]; 
    while (x < y) 
    { 
        do { 
            x++; 
        } while (x <= r && A[x] <= p); 
        do { 
            y--; 
        } while (A[y] > p); 
        if (x < y) 
            swap(A[x], A[y]); 
    } 
    A[l] = A[y]; 
    A[y] = p; 
    return y;   //return position of pivot 
} 

Analysis: 

Since our algorithm is randomized, no particular input is responsible for the worst case. 
However, the worst-case running time of this algorithm is $O(n^2)$. This happens if, unluckily, 
the pivot chosen every time is the largest remaining element (when we are looking for the minimum element). 
Assume that every element is equally likely to be selected as the pivot, i.e. with probability 1/n. Then we have the 
recurrence relation 

$$T(n) = \frac{1}{n}\sum_{j=1}^{n-1} T(\max(j, n-j)) + O(n)$$

where the sum is the conquer time and O(n) is the dividing time, and where 
max(j, n-j) = j if j >= ⌈n/2⌉, 
and max(j, n-j) = n-j otherwise. 

Here every T(j) or T(n-j) appears twice, once for j from 1 to ⌊n/2⌋ and a second time for j from 
⌈n/2⌉ to (n-1), so we can write 

$$T(n) = \frac{2}{n}\sum_{j=n/2}^{n-1} T(j) + O(n) \qquad (1)$$

Using the substitution method, 

Guess T(n) = O(n). 

Then we have to show that $T(n) \le cn \qquad (2)$ 

Basic step: for n = 1, 

$T(1) \le c \cdot 1$ 

=> $1 \le c$, which is true for all $c \ge 1$. 

Inductive step: let us assume that it is true for all j < n. 

Then $T(j) \le cj \qquad (3)$ 

Substituting in relation (1), and writing the O(n) term as an for some constant a, we get 

$$T(n) \le \frac{2}{n}\sum_{j=n/2}^{n-1} cj + an$$

$$= \frac{2}{n}\left\{\sum_{j=1}^{n-1} cj - \sum_{j=1}^{n/2-1} cj\right\} + an$$

$$= \frac{2}{n}\left\{c\,\frac{n(n-1)}{2} - c\,\frac{\frac{n}{2}\left(\frac{n}{2}-1\right)}{2}\right\} + an$$

$$= c(n-1) - \frac{c}{2}\left(\frac{n}{2} - 1\right) + an$$

$$= cn - c - \frac{cn}{4} + \frac{c}{2} + an$$

$$= cn - \left[\frac{cn}{4} + \frac{c}{2} - an\right] \le cn \quad \text{whenever } c \ge 4a$$

=> T(n) = O(n) 
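
A runnable sketch of randomized selection is shown below. It is written iteratively instead of recursively, works on a copy of the input, and uses Python's random.randint for the pivot choice; i counts from 1. These choices and the function names are assumptions of the example.

```python
import random


def partition(a, l, r):
    """Partition a[l..r] around the pivot a[l]; return the pivot's final index."""
    p = a[l]
    x, y = l, r + 1
    while x < y:
        x += 1
        while x <= r and a[x] <= p:
            x += 1
        y -= 1
        while a[y] > p:
            y -= 1
        if x < y:
            a[x], a[y] = a[y], a[x]
    a[l], a[y] = a[y], p
    return y


def rand_partition(a, l, r):
    k = random.randint(l, r)          # random index between l and r, inclusive
    a[l], a[k] = a[k], a[l]
    return partition(a, l, r)


def rand_select(a, i):
    """Return the i-th smallest element of a (i = 1 is the minimum)."""
    if not 1 <= i <= len(a):
        raise ValueError("i must be between 1 and len(a)")
    items = list(a)
    l, r = 0, len(items) - 1
    while l < r:
        p = rand_partition(items, l, r)
        k = p - l + 1                 # rank of the pivot within items[l..r]
        if i == k:
            return items[p]
        if i < k:
            r = p - 1                 # answer lies in the first partition
        else:
            l = p + 1                 # answer lies in the second partition
            i -= k
    return items[l]
```

Unlike Quick Sort, only one of the two partitions is processed after each split, which is why the expected cost is linear rather than n log n.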


Selection in worst case linear time: 

• Divide the n elements into groups of 5 elements. 

• Find the median of each group, which gives ⌊n/5⌋ medians. 

• Recursively SELECT the median x of the ⌊n/5⌋ medians to be the pivot element. 

• Partition the n elements around the pivot x, and let k be the position of the pivot after partitioning: 
if (i == k) then 
Return (A[k]) 
else if (i < k) then 
Recursively find the i-th smallest element in the first partition 
else 
Recursively find the (i-k)-th smallest element in the second partition 



From the above figure, at least half of the medians are <= x. 
Since there are ⌊n/5⌋ medians, 

at least ⌊n/5⌋/2 medians are <= x, 

or about n/10 medians are <= x. 

Each of these medians contributes 3 elements which are <= x (the median itself and the two smaller elements of its group), 

=> at least 3n/10 elements are <= x. 

Similarly, at least 3n/10 elements are >= x. 

Since there are n elements in total, 

if at least 3n/10 elements are <= x, 

then at most (n - 3n/10) = 7n/10 elements are > x, so the recursive call is made on at most 7n/10 elements. 

Now its recurrence relation is, 

T(n) = T(n/5) + T(7n/10) + O(n) 

Let us guess T(n) = O(n) 

=> T(n) <= c n .....(1) 

Now prove this by mathematical induction. 

Basic step: for n = 1, 

T(1) <= c · 1 

=> 1 <= c, which is true for all c >= 1 

Inductive step: let us assume that it is true for all k < n. 

Then T(k) <= c k .....(2) 

It is also true for k = n/5 and k = 7n/10, 

=> T(n/5) <= c n/5 
Also T(7n/10) <= 7c n/10 

Now from the given recurrence relation, writing the O(n) term as a n for some constant a, 

T(n) = T(n/5) + T(7n/10) + O(n) 

T(n) <= c n/5 + 7c n/10 + a n 

= 9c n/10 + a n 

= c n - (c/10 - a) n 

<= c n, whenever c >= 10a 

=> T(n) <= c n 
Thus T(n) = O(n) 
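
The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: it builds new lists for the partitions instead of partitioning in place, it also takes the median of a final group with fewer than 5 elements, and it handles duplicates by keeping the elements equal to the pivot in their own list. The name select is illustrative.

```python
def select(a, i):
    """Return the i-th smallest element of a (i = 1 is the minimum)."""
    if not 1 <= i <= len(a):
        raise ValueError("i must be between 1 and len(a)")
    if len(a) <= 5:
        return sorted(a)[i - 1]       # small inputs are solved directly

    # Divide into groups of 5 and take the median of each group.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Recursively select the median of the medians as the pivot x.
    x = select(medians, (len(medians) + 1) // 2)

    # Partition around x.
    smaller = [v for v in a if v < x]
    equal = [v for v in a if v == x]
    larger = [v for v in a if v > x]

    if i <= len(smaller):
        return select(smaller, i)
    if i <= len(smaller) + len(equal):
        return x
    return select(larger, i - len(smaller) - len(equal))
```

The choice of pivot guarantees that neither side of the partition can hold much more than 7n/10 of the elements, which is what keeps the worst case linear.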

Max and Min Finding 

Here our problem is to find the minimum and maximum items in a set of n elements. 
Divide and Conquer Algorithm for finding min-max: 

The main idea behind the algorithm is: if the number of elements is 1 or 2 then max and min are 
obtained trivially. Otherwise split the problem into two approximately equal parts and solve them 
recursively. 

MinMax(l, r)   //returns the pair {min, max} of A[l..r] 
{ 
    if (l == r) 
        max = min = A[l]; 
    else if (l == r-1) 
    { 
        if (A[l] < A[r]) 
        { 
            max = A[r]; 
            min = A[l]; 
        } 
        else 
        { 
            max = A[l]; 
            min = A[r]; 
        } 
    } 
    else 
    { 
        //Divide the problem 
        mid = (l + r)/2;   //integer division 
        //Solve the subproblems 
        {min, max} = MinMax(l, mid); 
        {min1, max1} = MinMax(mid+1, r); 
        //Combine the solutions 
        if (max1 > max) max = max1; 
        if (min1 < min) min = min1; 
    } 
    return {min, max}; 
} 

Analysis: 

We can give the recurrence relation below for the MinMax algorithm in terms of the number of 
comparisons. The combine step makes two comparisons, so 

T(n) = 2T(n/2) + 2, if n > 2 
T(n) = 1, if n <= 2 

Solving the recurrence by using the master method (case 1), the complexity is O(n). 
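
A runnable sketch of the recursive algorithm follows. Passing the array explicitly and returning a (min, max) tuple are choices made for this example.

```python
def min_max(a, l, r):
    """Return (minimum, maximum) of a[l..r] by divide and conquer."""
    if l == r:                        # one element
        return a[l], a[l]
    if l == r - 1:                    # two elements: a single comparison
        return (a[l], a[r]) if a[l] < a[r] else (a[r], a[l])
    mid = (l + r) // 2                # divide
    lo1, hi1 = min_max(a, l, mid)     # solve the sub-problems
    lo2, hi2 = min_max(a, mid + 1, r)
    # Combine: two comparisons.
    return (lo1 if lo1 < lo2 else lo2, hi1 if hi1 > hi2 else hi2)
```

For example, min_max([3, 1, 4, 1, 5, 9, 2, 6], 0, 7) returns (1, 9).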


Matrix Multiplication 

Given two n-by-n matrices A and B, our aim is to find the product of A and B as C, which is also an 
n-by-n matrix. We can find this by using the relation 

$$C(i,j) = \sum_{k=1}^{n} A(i,k)\,B(k,j)$$

MatrixMultiply(A, B) 
{ 
    for (i = 0; i < n; i++) 
    { 
        for (j = 0; j < n; j++) 
        { 
            C[i][j] = 0;   //start each sum from zero 
            for (k = 0; k < n; k++) 
            { 
                C[i][j] = C[i][j] + A[i][k]*B[k][j]; 
            } 
        } 
    } 
} 

Analysis: 

Using the above formula we need O(n) time to get C(i,j). There are n^2 elements in C, hence the 
time required for matrix multiplication is O(n^3). We can try to improve the above complexity by 
using the divide and conquer strategy. 


Divide and Conquer Algorithm for Matrix Multiplication 

Divide each n x n square matrix into four matrices of size n/2 x n/2. The basic calculation is 
done for matrices of size 2 x 2. 

    | C11  C12 |     | A11  A12 |     | B11  B12 |
    |          |  =  |          |  x  |          |
    | C21  C22 |     | A21  A22 |     | B21  B22 |

Where 

C11 = A11 x B11 + A12 x B21 
C12 = A11 x B12 + A12 x B22 
C21 = A21 x B11 + A22 x B21 
C22 = A21 x B12 + A22 x B22 

Now, we can write the recurrence relation for this as 
T(n) = b if n <= 2 

T(n) = 8T(n/2) + cn^2 if n > 2 
Solving this we get, T(n) = O(n^3) 


Strassens’s Matrix Multiplication 

Strassen showed that 2 x 2 matrix multiplication can be accomplished in 7 multiplications and 
18 additions or subtractions. 

The basic calculation is done for matrices of size 2 x 2. 

    | C11  C12 |     | A11  A12 |     | B11  B12 |
    |          |  =  |          |  x  |          |
    | C21  C22 |     | A21  A22 |     | B21  B22 |

Where 

P1 = (A11 + A22) * (B11 + B22) 
P2 = (A21 + A22) * B11 
P3 = A11 * (B12 - B22) 
P4 = A22 * (B21 - B11) 
P5 = (A11 + A12) * B22 
P6 = (A21 - A11) * (B11 + B12) 
P7 = (A12 - A22) * (B21 + B22) 

C11 = P1 + P4 - P5 + P7 
C12 = P3 + P5 
C21 = P2 + P4 
C22 = P1 + P3 - P2 + P6 

Now, we can write the recurrence relation for this as 
T(n) = b if n <= 2 

T(n) = 7T(n/2) + cn^2 if n > 2 
Solving this we get, T(n) = O(n^2.81) 
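
The seven products can be checked numerically on a 2 x 2 example. The sketch below treats the entries as plain numbers (in the full algorithm they would be n/2 x n/2 blocks multiplied recursively) and compares the result with the ordinary definition; the function names are illustrative.

```python
def strassen_2x2(A, B):
    """Multiply two 2 x 2 matrices using Strassen's seven products."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    p1 = (a11 + a22) * (b11 + b22)
    p2 = (a21 + a22) * b11
    p3 = a11 * (b12 - b22)
    p4 = a22 * (b21 - b11)
    p5 = (a11 + a12) * b22
    p6 = (a21 - a11) * (b11 + b12)
    p7 = (a12 - a22) * (b21 + b22)
    return [[p1 + p4 - p5 + p7, p3 + p5],
            [p2 + p4, p1 + p3 - p2 + p6]]


def naive_multiply(A, B):
    """Ordinary O(n^3) matrix product, used here only for comparison."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

With A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]] both functions return [[19, 22], [43, 50]]. Saving one multiplication per level is what lowers the exponent from 3 to log2 7, approximately 2.81.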
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Unit 2 


Chapter: 2 

Dynamic Programming 


The dynamic programming (DP) technique is among the most powerful for designing algorithms for
optimization problems. Dynamic programming problems are typically optimization problems (find the
minimum or maximum cost solution, subject to various constraints). The technique is related to
divide-and-conquer, in the sense that it breaks problems down into smaller problems that it solves
recursively. However, because the sub-problems of a dynamic programming problem overlap, standard
divide-and-conquer solutions are not usually efficient. The basic elements that characterize a
dynamic programming algorithm are:

Substructure: Decompose your problem into smaller (and hopefully simpler) sub-problems.
Express the solution of the original problem in terms of solutions for smaller
problems.

Table-structure: Store the answers to the sub-problems in a table. This is done because 
sub-problem solutions are reused many times. 

Bottom-up computation: Combine solutions of smaller sub-problems to solve larger
sub-problems.

The most important question in designing a DP solution to a problem is how to set up the
sub-problem structure. This is called the formulation of the problem. Dynamic programming is not
applicable to all optimization problems. There are two important elements that a problem must
have in order for DP to be applicable.

Optimal substructure: (Sometimes called the principle of optimality.) It states that for the
global problem to be solved optimally, each sub-problem should be solved optimally. (Not
all optimization problems satisfy this. Sometimes it is better to lose a little on one
sub-problem in order to make a big gain on another.)

Polynomially many sub-problems: An important aspect to the efficiency of DP is that the 
total number of sub-problems to be solved should be at most a polynomial number. 

Fibonacci numbers 

Recursive Fibonacci revisited: 

In the recursive version of the algorithm for finding a Fibonacci number, each call on a larger
argument recomputes the Fibonacci numbers of the two previous arguments, regardless of whether
they have already been computed. So there are many redundant calculations. Let's calculate the
Fibonacci number of 4. The call tree below shows the repetition in the calculation.

    fib(4)
        fib(3)
            fib(2)
                fib(1)
                fib(0)
            fib(1)
        fib(2)
            fib(1)
            fib(0)

In the above tree we see that fib(0) is calculated 2 times, fib(1) 3 times, fib(2) 2 times, and
so on. So if we somehow eliminate those repetitions we will save running time.

Algorithm: 

DynaFibo(n)
{
    A[0] = 0;
    A[1] = 1;
    for(i = 2; i <= n; i++)
        A[i] = A[i-2] + A[i-1];
    return A[n];
}

Analysis 

Analyzing the above algorithm we find that no sub-problem that has already been solved is
calculated again, and the running time decreases from exponential (at least on the order of
2^(n/2) for the recursive version) to O(n). This reduction is possible because the solution of
each sub-problem is remembered and reused to solve the problems of higher size.
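The difference between the two versions can be made visible with a small Python sketch. The function names and the call counter are illustrative assumptions; the bottom-up function mirrors DynaFibo.

```python
def fib_recursive(n, counter):
    """Plain recursion; counter[0] counts how many calls are made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)


def dyna_fibo(n):
    """Bottom-up version: each sub-problem is solved once and stored in A."""
    if n < 2:
        return n
    A = [0] * (n + 1)
    A[1] = 1
    for i in range(2, n + 1):
        A[i] = A[i - 2] + A[i - 1]
    return A[n]


calls = [0]
print(fib_recursive(4, calls), calls[0])  # 3 9  (nine calls for fib(4))
print(dyna_fibo(4))                       # 3
```

The nine calls reported for fib(4) match the repetition counts above (1 + 1 + 2 + 3 + 2), whereas the table version performs one addition per index.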


0/1 Knapsack Problem 

Statement: A thief has a bag or knapsack that can hold a maximum weight W of his loot. There
are n items; the i-th item has weight $w_i$ and is worth $v_i$. Each item is either taken whole or
left behind, i.e. $x_i$ is 0 or 1. The objective is to collect the items that maximize the total
profit earned without exceeding the capacity W.

Let W = capacity of the knapsack
n = number of items

w = {w1, w2, ..., wn} = weights of items

v = {v1, v2, ..., vn} = values of items

C[i, w] = maximum profit earned using only the first i items with a knapsack of capacity w. Then
the recurrence relation for the 0/1 knapsack problem is given as,

$$
C[i,w] = \begin{cases}
0 & \text{if } i = 0 \text{ or } w = 0 \\
C[i-1,w] & \text{if } w_i > w \\
\max\{v_i + C[i-1,w-w_i],\; C[i-1,w]\} & \text{if } i > 0 \text{ and } w \ge w_i
\end{cases}
$$

Algorithm: 

DynaKnapsack(W, n, v, wt)    // wt[i] is the weight of item i, v[i] its value
{
    for(w = 0; w <= W; w++)
        C[0,w] = 0;
    for(i = 1; i <= n; i++)
        C[i,0] = 0;
    for(i = 1; i <= n; i++)
    {
        for(w = 1; w <= W; w++)
        {
            if(wt[i] <= w)    // item i fits in capacity w
            {
                if(v[i] + C[i-1,w-wt[i]] > C[i-1,w])
                    C[i,w] = v[i] + C[i-1,w-wt[i]];
                else
                    C[i,w] = C[i-1,w];
            }
            else    // item i does not fit
                C[i,w] = C[i-1,w];
        }
    }
    return C[n,W];
}


Analysis 

The two nested loops fill a table of (n+1) x (W+1) cells with constant work per cell, so the
overall running time of the algorithm is O(nW).
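A minimal Python sketch of the table-filling procedure is given below. It assumes 0-based Python lists for the weights and values (so item i of the notes is index i-1), and the function name knapsack_01 is hypothetical.

```python
def knapsack_01(W, weights, values):
    """Return the DP table C, where C[i][w] is the best profit using the
    first i items with capacity w (item i is weights[i-1], values[i-1])."""
    n = len(weights)
    C = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 and column 0 stay 0
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for w in range(1, W + 1):
            C[i][w] = C[i - 1][w]               # leave item i out
            if wi <= w:                         # item i fits: try taking it
                C[i][w] = max(C[i][w], vi + C[i - 1][w - wi])
    return C


# Three items with weights 1, 2, 3 and values 2, 3, 4; capacity 3.
table = knapsack_01(3, [1, 2, 3], [2, 3, 4])
print(table[3][3])  # 5
```

Returning the whole table rather than only C[n][W] makes it easy to compare individual cells with a hand-computed table.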
Example

Let the problem instance have 7 items where v[ ] = {2, 3, 3, 4, 4, 5, 7}, w[ ] = {3, 5, 7, 4, 3, 9, 2}
and W = 9.

Solution: for i = 0 or w = 0,
c[i,w] = 0

i.e. c[0,0] = c[0,1] = c[0,2] = ... = c[0,9] = c[1,0] = ... = c[7,0] = 0

c[1,1] = c[0,1] = 0 since w1 > w, i.e. 3 > 1, so the second case of the recurrence relation applies.

c[1,3] = max{v1 + c[0,3-3], c[0,3]} = max{2+0, 0} = 2 since w >= w1, i.e. 3 >= 3, so the third
case applies.

Continue this process to calculate the value of each cell; finally we get the following table,

 i \ w |  0   1   2   3   4   5   6   7   8   9
 ------+----------------------------------------
   0   |  0   0   0   0   0   0   0   0   0   0
   1   |  0   0   0   2   2   2   2   2   2   2
   2   |  0   0   0   2   2   3   3   3   5   5
   3   |  0   0   0   2   2   3   3   3   5   5
   4   |  0   0   0   2   4   4   4   6   6   7
   5   |  0   0   0   4   4   4   6   8   8   8
   6   |  0   0   0   4   4   4   6   8   8   8
   7   |  0   0   7   7   7  11  11  11  13  15

Profit = C[7][9] = 15


Example 2:

W = 3

Items = {i1, i2, i3}
wi = {1, 2, 3}
vi = {2, 3, 4}
Solution: left as an exercise.

We get max profit = 5 (by taking items i1 and i2).


Matrix Chain Multiplication 

Chain Matrix Multiplication Problem: Given a sequence of matrices $A_1, A_2, \ldots, A_n$ and
dimensions $p_0, p_1, \ldots, p_n$, where $A_i$ is of dimension $p_{i-1} \times p_i$, determine the
order of multiplication that minimizes the number of operations.


Important Note: This algorithm does not perform the multiplications; it just determines the best 
order in which to perform the multiplications. 

Although any legal parenthesization will lead to a valid result, not all involve the same number of 
operations. 

Consider the case of 3 matrices: A1 is 5 x 4, A2 is 4 x 6 and A3 is 6 x 2.
multCost[((A1A2)A3)] = (5*4*6) + (5*6*2) = 180
multCost[(A1(A2A3))] = (4*6*2) + (5*4*2) = 88
Even for this small example, considerable savings can be achieved by reordering the evaluation
sequence.

Let $A_{i..j}$ denote the result of multiplying matrices i through j. It is easy to see that
$A_{i..j}$ is a $p_{i-1} \times p_j$ matrix. So for some split point k, the total cost is the sum of
the cost of computing $A_{i..k}$, the cost of computing $A_{k+1..j}$, and the cost of multiplying
$A_{i..k}$ and $A_{k+1..j}$.

Here we check all possible choices of k and take the best among them.

Recursive definition of optimal solution: let m[i,j] denote the minimum number of scalar
multiplications needed to compute $A_{i..j}$.

$$
m[i,j] = \begin{cases}
0 & \text{if } i = j \text{ (the sequence contains only one matrix)} \\
\min_{i \le k < j} \{ m[i,k] + m[k+1,j] + p_{i-1} p_k p_j \} & \text{if } i < j
\end{cases}
$$

Note:

$A_{i..k}$ has dimension $p_{i-1} \times p_k$
and $A_{k+1..j}$ has dimension $p_k \times p_j$, so multiplying them takes $p_{i-1} p_k p_j$ scalar
multiplications.


Algorithm: 

Matrix-Chain-Multiplication(p)
{
    n = length[p] - 1    // p holds the n+1 dimensions p[0..n]
    for(i = 1; i <= n; i++)
    {
        m[i,i] = 0
    }
    for(len = 2; len <= n; len++)    // len is the length of the chain
    {
        for(i = 1; i <= n-len+1; i++)
        {
            j = i + len - 1
            m[i,j] = ∞
            for(k = i; k <= j-1; k++)
            {
                c = m[i,k] + m[k+1,j] + p[i-1] * p[k] * p[j]
                if c < m[i,j]
                {
                    m[i,j] = c
                    s[i,j] = k
                }
            }
        }
    }
    return m and s
}

Analysis 

The above algorithm can be easily analyzed for running time as O(n^3), due to three nested loops.
The space complexity is O(n^2).
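The procedure can be sketched in Python as below. The function names matrix_chain_order and parenthesize are illustrative assumptions; the tables are 1-indexed as in the pseudocode, with row and column 0 unused.

```python
import math


def matrix_chain_order(p):
    """p[0..n] are the dimensions; matrix A_i is p[i-1] x p[i].
    Returns (m, s): minimum costs and the split points achieving them."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):               # try every split point
                c = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if c < m[i][j]:
                    m[i][j] = c
                    s[i][j] = k
    return m, s


def parenthesize(s, i, j):
    """Rebuild the optimal order from the split table s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + parenthesize(s, i, k) + parenthesize(s, k + 1, j) + ")"


# Three matrices of sizes 5 x 4, 4 x 6 and 6 x 2.
m, s = matrix_chain_order([5, 4, 6, 2])
print(m[1][3], parenthesize(s, 1, 3))  # 88 (A1(A2A3))
```

As in the notes, the sketch only determines the best order; it does not perform the multiplications.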

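Example:

Consider matrices A1, A2, A3 and A4 of order 3x4, 4x5, 5x2 and 2x3, so that p = (3, 4, 5, 2, 3).
Find the optimal sequence for the computation of the multiplication.

M Table (cost of multiplication)

 i \ j |   1    2    3    4
 ------+--------------------
   1   |   0   60   64   82
   2   |        0   40   64
   3   |             0   30
   4   |                  0

S Table (points of parenthesis)

 i \ j |   1    2    3    4
 ------+--------------------
   1   |        1    1    3
   2   |             2    3
   3   |                  3
   4   |

m[1,1] = m[2,2] = m[3,3] = m[4,4] = 0

m[1,2] = min{m[1,1] + m[2,2] + p0*p1*p2} = min{0 + 0 + 3*4*5} = 60
m[2,3] = min{m[2,2] + m[3,3] + p1*p2*p3} = min{0 + 0 + 4*5*2} = 40
m[3,4] = min{m[3,3] + m[4,4] + p2*p3*p4} = min{0 + 0 + 5*2*3} = 30

m[1,3] = min{m[1,1] + m[2,3] + p0*p1*p3, m[1,2] + m[3,3] + p0*p2*p3}
       = min{0 + 40 + 3*4*2, 60 + 0 + 3*5*2} = min{64, 90} = 64
m[2,4] = min{m[2,2] + m[3,4] + p1*p2*p4, m[2,3] + m[4,4] + p1*p3*p4}
       = min{0 + 30 + 4*5*3, 40 + 0 + 4*2*3} = min{90, 64} = 64
m[1,4] = min{m[1,1] + m[2,4] + p0*p1*p4, m[1,2] + m[3,4] + p0*p2*p4, m[1,3] + m[4,4] + p0*p3*p4}
       = min{0 + 64 + 3*4*3, 60 + 30 + 3*5*3, 64 + 0 + 3*2*3} = min{100, 135, 82} = 82

Now the optimal multiplication cost = 82, and the optimal sequence is
(A1A2A3A4) => ((A1A2A3)(A4)) => (((A1)(A2A3))(A4))

This means: first multiply matrices A2 and A3, then multiply A1 with their result, and finally
multiply that result with A4.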
Longest Common Subsequence Problem (LCS)

This method is used to test the matching or similarity between two strings.

Given two sequences $X = (x_1, x_2, \ldots, x_m)$ and $Z = (z_1, z_2, \ldots, z_k)$, we say that Z is a
subsequence of X if there is a strictly increasing sequence of k indices $(i_1, i_2, \ldots, i_k)$
with $1 \le i_1 < i_2 < \cdots < i_k \le m$ such that $Z = (x_{i_1}, x_{i_2}, \ldots, x_{i_k})$.

For example, let X = (ABRACADABRA) and let Z = (AADAA), then Z is a subsequence of X.
Given two strings X and Y, the longest common subsequence of X and Y is a longest sequence Z
that is a subsequence of both X and Y.

For example, let X = (ABRACADABRA) and
let Y = (YABBADABBAD). Then the longest common subsequence is
Z = (ABADABA)


Recurrence relation for LCS:

Let $X_i$ and $Y_j$ denote the prefixes of X and Y consisting of their first i and j characters.

L[i, j] represents the length of the LCS of $X_i$ and $Y_j$; then its recurrence relation is,

$$
L[i,j] = \begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0 \text{ (either sequence is empty)} \\
L[i-1,j-1] + 1 & \text{if } i, j > 0 \text{ and } x_i = y_j \text{ (the last characters match)} \\
\max\{L[i-1,j],\; L[i,j-1]\} & \text{if } i, j > 0 \text{ and } x_i \ne y_j \text{ (the last characters do not match)}
\end{cases}
$$


Algorithm: 

LCS(X, Y)
{
    m = length[X];
    n = length[Y];
    for(i = 1; i <= m; i++)
        c[i][0] = 0;
    for(j = 0; j <= n; j++)
        c[0][j] = 0;
    for(i = 1; i <= m; i++)
        for(j = 1; j <= n; j++)
        {
            if(X[i] == Y[j])
            {
                c[i][j] = c[i-1][j-1] + 1; b[i][j] = "upleft";
            }
            else if(c[i-1][j] >= c[i][j-1])
            {
                c[i][j] = c[i-1][j]; b[i][j] = "up";
            }
            else
            {
                c[i][j] = c[i][j-1]; b[i][j] = "left";
            }
        }
    return b and c;
}

Analysis: 

The above algorithm can be easily analyzed for running time as O(mn), due to two nested loops. 
The space complexity is O(mn). 
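A short Python sketch of the table computation and the traceback follows. It assumes 0-based Python strings, so character $x_i$ of the notes is X[i-1]; instead of storing the arrows in a separate table b, the sketch re-derives each direction from c during the traceback (a design choice of this sketch, not of the notes).

```python
def lcs(X, Y):
    """Return (length, one longest common subsequence) of strings X and Y."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1            # "upleft"
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])  # "up" or "left"
    # Trace back from c[m][n], collecting matched characters.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return c[m][n], "".join(reversed(out))


print(lcs("ABRACADABRA", "YABBADABBAD")[0])  # 7
```

When several longest common subsequences exist, the tie-breaking rule (prefer "up" on ties) decides which one is returned.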


Example: 

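Consider the character sequences X = abbabba and Y = aaabba; find the LCS.

In the table below the rows are the characters of X and the columns are the characters of Y.
Only the cells on the traceback path show the direction stored in b.

   X \ Y |  -     a         a         a         b         b         a
 --------+--------------------------------------------------------------------
    -    |  0     0         0         0         0         0         0
    a    |  0     1         1 upleft  1         1         1         1
    b    |  0     1         1 up      1         2         2         2
    b    |  0     1         1 up      1         2         3         3
    a    |  0     1         2         2 upleft  2         3         4
    b    |  0     1         2         2         3 upleft  3         4
    b    |  0     1         2         2         3         4 upleft  4
    a    |  0     1         2         3         3         4         5 upleft

LCS = a a b b a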


Example 2: Given two sequences of characters
P = <M L N O M>

Q = <M N O M>, find the LCS.
Solution: left as an exercise.

We get LCS = MNOM (all of Q, since Q itself is a subsequence of P).
Chapter: 3

Greedy Paradigm 

The greedy method is a simple, straightforward way of designing algorithms. The general class
of problems solved by the greedy approach is optimization problems. In this approach a selection
of the input elements that satisfies the given constraints is called a feasible solution, and the
feasible solution that best meets some objective function among all feasible solutions is called
an optimal solution. A greedy algorithm always makes the choice that is locally optimal in the
hope of generating a globally optimal solution; however, it is not guaranteed that every greedy
algorithm yields an optimal solution. We generally cannot tell in advance whether a given
optimization problem can be solved by the greedy method, but most of the problems that can be
solved using the greedy approach have two properties:

Greedy choice property 

A globally optimal solution can be obtained by making locally optimal choices; the choice made at
present may depend on the choices made so far, but not on possible choices in the future.

Optimal substructure 

Optimal substructure is exhibited by a problem if an optimal solution to the problem contains 
optimal solutions to the sub-problems within it. 

To prove that a greedy algorithm is optimal we must show that the above two properties hold. For
this purpose, first take a globally optimal solution; then show that it can be modified so that
the greedy choice is made at the first step, i.e. the greedy choice is part of some optimal
solution, and that this choice leaves a problem of the same kind but of smaller size; at last, use
induction to prove that the greedy choice is best at each step, which relies on optimal
substructure.


Fractional Knapsack Problem 

Statement: A thief has a bag or knapsack that can hold a maximum weight W of his loot. There
are n items; the i-th item has weight $w_i$ and is worth $v_i$. Any amount of an item can be put
into the bag, i.e. a fraction $x_i$ of item i can be collected, where $0 \le x_i \le 1$. The
objective is to collect the items that maximize the total profit earned.

Here we arrange the items in nonincreasing order of the ratio $v_i / w_i$.

Algorithm: 

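GreedyFracKnapsack(W, n)    // items are sorted so that v[1]/w[1] >= v[2]/w[2] >= ... >= v[n]/w[n]
{
    for(i = 1; i <= n; i++)
        x[i] = 0.0;
    tempW = W;
    for(i = 1; i <= n; i++)
    {
        if(w[i] > tempW)
            break;
        x[i] = 1.0;
        tempW -= w[i];
    }
    if(i <= n)    // item i does not fit whole: take the fraction that fits
        x[i] = tempW/w[i];
}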
Analysis:

We can see that the above algorithm contains only single loops, i.e. no nested loops, so its
running time is O(n). However, our requirement is that v[1 ... n] and w[1 ... n] are sorted by
the ratio v[i]/w[i]. We can use a sorting method to do this in O(n log n) time, so the complexity
of the algorithm including sorting becomes O(n log n).
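The greedy procedure, including the sorting step, can be sketched in Python as follows. The function name fractional_knapsack is hypothetical, and the sketch assumes all weights are positive.

```python
def fractional_knapsack(W, weights, values):
    """Greedy fractional knapsack. Returns (total value, fractions x),
    where x[i] is the fraction of item i that is taken."""
    # Consider items in nonincreasing order of value per unit weight.
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    x = [0.0] * len(weights)
    remaining = W
    total = 0.0
    for i in order:
        if weights[i] <= remaining:          # the whole item fits
            x[i] = 1.0
            remaining -= weights[i]
            total += values[i]
        else:                                # take the part that fits, then stop
            x[i] = remaining / weights[i]
            total += x[i] * values[i]
            break
    return total, x


total, x = fractional_knapsack(60, [5, 10, 20, 30, 40], [30, 20, 100, 90, 160])
print(total, x)  # 270.0 [1.0, 0.0, 1.0, 0.0, 0.875]
```

The call uses five items with weights 5, 10, 20, 30, 40, values 30, 20, 100, 90, 160 and capacity 60; the fractions are reported in the original item order rather than the sorted order.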


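Example: Consider five items along with their respective weights and values,

I = {I1, I2, I3, I4, I5}
w = {5, 10, 20, 30, 40}
v = {30, 20, 100, 90, 160}

The knapsack has capacity W = 60; find the optimal profit earned by using fractional knapsack.
Solution: Initially

 Items |  wi  |  vi
 ------+------+------
  I1   |   5  |   30
  I2   |  10  |   20
  I3   |  20  |  100
  I4   |  30  |   90
  I5   |  40  |  160

Step 2: calculate pi = vi/wi as,

 Items |  wi  |  vi  | pi = vi/wi
 ------+------+------+-----------
  I1   |   5  |   30 |    6.0
  I2   |  10  |   20 |    2.0
  I3   |  20  |  100 |    5.0
  I4   |  30  |   90 |    3.0
  I5   |  40  |  160 |    4.0

Step 3: Arrange the items in decreasing order of pi as,

 Items |  wi  |  vi  | pi = vi/wi
 ------+------+------+-----------
  I1   |   5  |   30 |    6.0
  I3   |  20  |  100 |    5.0
  I5   |  40  |  160 |    4.0
  I4   |  30  |   90 |    3.0
  I2   |  10  |   20 |    2.0

Now fill the knapsack in decreasing order of pi:

 I1 is taken whole: weight 5, value 30, remaining capacity 55
 I3 is taken whole: weight 20, value 100, remaining capacity 35
 I5 has weight 40 > 35, so only 35 of its 40 units of weight are taken:
     40 units of weight are worth 160
     1 unit of weight is worth 160/40 = 4
     35 units of weight are worth 35 * 4 = 140

Maximum value = v1 + v3 + (part of v5) = 30 + 100 + 140 = 270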
Huffman Codes:

Huffman codes are used to compress data by representing each character by a unique binary code in
an optimal way. As an example consider a file of 100,000 characters with the following
frequency distribution, assuming that there are only 7 distinct characters:

f(a) = 40,000, f(b) = 20,000, f(c) = 15,000, f(d) = 12,000, f(e) = 8,000, f(f) = 3,000,
f(g) = 2,000.

With a fixed length code we need 3 bits to represent each of the 7 characters, e.g.
a = 000, b = 001, c = 010, d = 011, e = 100, f = 101, g = 110.

The total number of bits required with the fixed length code is 100,000 * 3 = 300,000.

Greedy strategy: Now consider a variable length code in which characters with higher frequency
are given shorter codes. Repeatedly remove the two nodes of lowest frequency from a priority
queue and merge them into a new node whose frequency is their sum.

C = {a, b, c, d, e, f, g}; f(c) = 40, 20, 15, 12, 8, 3, 2 (in thousands); n = 7
Initial priority queue (in increasing order of frequency) is

g:2   f:3   e:8   d:12   c:15   b:20   a:40

The merges are: g + f = 5; 5 + e = 13; d + 13 = 25; c + b = 35; 25 + 35 = 60; a + 60 = 100.

Labelling each left branch 0 and each right branch 1 gives, for example,
a = 0, b = 111, c = 110, d = 100, e = 1011, f = 10101, g = 10100
The total number of bits required with this variable length code is
40,000*1 + 20,000*3 + 15,000*3 + 12,000*3 + 8,000*4 + 3,000*5 + 2,000*5
i.e. 238,000 bits

Here we saved approximately 21% of the space.
Analysis

We can use BuildHeap(C) (see notes on sorting) to create a priority queue, which takes O(n) time.
Inside the for loop the expensive operations can be done in O(log n) time. Since the operations
inside the for loop execute n-1 times, the total running time of HuffmanAlgo is O(n log n).
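The greedy construction analysed above can be sketched in Python with the standard heapq module as the priority queue. This is an assumed illustration, not the HuffmanAlgo of the notes: the function name huffman_codes, the tuple representation of internal nodes and the tie-breaking counter are all choices of this sketch.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """freq maps each symbol (a string) to its frequency.
    Returns a dict mapping each symbol to its binary code string."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)                               # O(n) heap construction
    while len(heap) > 1:                              # n-1 merges
        f1, _, left = heapq.heappop(heap)             # two least frequent nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):                   # internal node
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                         # leaf: a symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


freq = {"a": 40, "b": 20, "c": 15, "d": 12, "e": 8, "f": 3, "g": 2}
codes = huffman_codes(freq)
cost = sum(freq[sym] * len(code) for sym, code in codes.items())
print(codes, cost)
```

The cost computed at the end is the total number of bits, in the same units as the frequencies, so it can be compared directly with a fixed length encoding.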


Job Sequencing with Deadline: 

• A set of n jobs, S = {a1, a2, a3, ..., an}

• Deadlines of the jobs = {d1, d2, d3, ..., dn}

• Profit pi is earned only if job ai is completed within its deadline; profits = {p1, p2, p3, ..., pn}

Here every job can be completed in unit time (i.e. the first job begins at time 0 and finishes at
time 1, the second job begins at time 1 and finishes at time 2, and so on) and we have a single
machine (processor).

The main aim of this problem is to find the feasible sequence of jobs that maximizes the profit
earned.

Example: let's assume that there are 4 jobs
n = 4

S = (p1, p2, p3, p4)

D = (d1, d2, d3, d4) = (2, 1, 2, 1)

P = (100, 10, 15, 27)

Find the sequence that maximizes the profit.

Case 1: consider the jobs in the order p1, p2, p3, p4

Job p1: feasible, profit 100
Job p2: not feasible, profit stays 100
Job p3: feasible, profit 100 + 15 = 115
Job p4: not feasible, profit stays 115

Case 2: consider the jobs in the order p2, p1, p3, p4

Job p2: feasible, profit 10
Job p1: feasible, profit 10 + 100 = 110
Job p3: not feasible, profit stays 110
Job p4: not feasible, profit stays 110

In this way we can find the profits of all possible sequences of jobs and choose a sequence
that produces the maximum profit.

But in this way the time complexity of finding the optimal solution is O(n!).

Now an alternative way to find the optimal solution in less time is the greedy approach.

According to the greedy algorithm for this problem, first sort the jobs in decreasing order of
profit:

S = (p1, p4, p3, p2)

D = (d1, d4, d3, d2) = (2, 1, 2, 1)

P = (100, 27, 15, 10)

 Job | Feasible / not feasible | Processing sequence | Profit
 ----+-------------------------+---------------------+----------------
 p1  | feasible                | {p1}                | 100
 p4  | feasible                | {p1, p4}            | 100 + 27 = 127
 p3  | not feasible            | {p1, p4}            | 127
 p2  | not feasible            | {p1, p4}            | 127

Thus the optimal profit = 127 with the job set {p1, p4}, processed in the order p4, p1 so that
each job meets its deadline.

Algorithm: 

Assume the jobs are ordered such that p[1] >= p[2] >= ... >= p[n]. d[i] >= 1, 1 <= i <= n, are the
deadlines, n >= 1. J[i] is the i-th job in the optimal solution, 1 <= i <= k. Also, at
termination d[J[i]] <= d[J[i+1]], 1 <= i < k.

JobSequencing(int d[ ], int J[ ], int n)
{
    d[0] = J[0] = 0;    // Initialize.
    J[1] = 1;           // Include job 1.
    int k = 1;
    for (int i = 2; i <= n; i++)
    {
        // Find the position for job i and check the feasibility of inserting it.
        int r = k;
        while ((d[J[r]] > d[i]) && (d[J[r]] != r))
            r--;
        if ((d[J[r]] <= d[i]) && (d[i] > r))
        {
            // Insert i into J[].
            for (int q = k; q >= (r+1); q--)
                J[q+1] = J[q];
            J[r+1] = i; k++;
        }
    }
    return (k);
}


Analysis 

The outer for loop executes O(n) times. The while loop inside the for loop executes at most k
times, and if the condition of the if statement is true the inner for loop executes O(k-r) times.
Hence the total time for each iteration of the outer for loop is O(k). Since k <= n, the time
complexity is O(n^2).
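For comparison, here is a Python sketch of the same greedy idea using a simpler slot-based formulation rather than the insertion scheme of JobSequencing: each job, taken in decreasing order of profit, is placed in the latest free unit-time slot on or before its deadline. The function name and the 0-based job indices are assumptions of this sketch.

```python
def job_sequencing(deadlines, profits):
    """Greedy job sequencing with unit-time jobs on one machine.
    Returns (schedule, total profit); schedule lists job indices
    (0-based) in the order in which they are processed."""
    jobs = sorted(range(len(profits)), key=lambda i: profits[i], reverse=True)
    max_d = max(deadlines, default=0)
    slot = [None] * (max_d + 1)      # slot[t] holds the job run in time (t-1, t]
    for i in jobs:
        # Try the latest free slot that still meets the deadline of job i.
        for t in range(deadlines[i], 0, -1):
            if slot[t] is None:
                slot[t] = i
                break
    schedule = [i for i in slot[1:] if i is not None]
    return schedule, sum(profits[i] for i in schedule)


# Deadlines (2, 1, 2, 1) and profits (100, 10, 15, 27), as in the example.
print(job_sequencing([2, 1, 2, 1], [100, 10, 15, 27]))  # ([3, 0], 127)
```

Index 3 is the job with profit 27 and index 0 the job with profit 100, so the printed schedule runs the profit-27 job first and the profit-100 job second. Like the insertion-based version, this sketch takes O(n^2) time in the worst case.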
Unit 3

Graph Algorithms 

A graph is a collection of vertices or nodes, connected by a collection of edges. Graphs are
extremely important because they are a very flexible mathematical model for many application
problems. Basically, any time you have a set of objects, and there is some “connection” or
“relationship” or “interaction” between pairs of objects, a graph is a good way to model this.
Examples of graphs in applications include communication and transportation networks, VLSI and
other sorts of logic circuits, surface meshes used for shape description in computer-aided design
and geographic information systems, precedence constraints in scheduling systems, etc.

A directed graph (or digraph) G = (V,E) consists of a finite set V , called the vertices or nodes, and 
E, a set of ordered pairs, called the edges of G. 

An undirected graph (or graph) G = (V,E) consists of a finite set V of vertices, and a set E of 
unordered pairs of distinct vertices, called the edges. 


Graph Traversals 

There are a number of approaches used for solving problems on graphs. One of the most important
approaches is based on the notion of systematically visiting all the vertices and edges of a graph.
The reason for this is that these traversals impose a type of tree structure (or generally a forest) on
the graph, and trees are usually much easier to reason about than general graphs.


Breadth-first search 


This is one of the simplest methods of graph searching. Choose some vertex arbitrarily as a root.
Add all the vertices and edges that are incident to the root. The new vertices added become the
vertices at level 1 of the BFS tree. From the set of vertices added at level 1, find the other
vertices that are connected by edges to the level 1 vertices; these become level 2. Repeat the
above step until all the vertices are added.

Algorithm: 

BFS(G, s)    // s is the start vertex
{
    T = {s};
    L = ∅;    // an empty queue
    Enqueue(L, s);
    while (L != ∅)
    {
        v = Dequeue(L);
        for each neighbor w of v
            if (w ∉ L and w ∉ T)
            {
                Enqueue(L, w);
                T = T ∪ {w};    // put edge {v,w} also
            }
    }
}
Example:

Use breadth first search to find a BFS tree of the following graph. 



Solution: 



Analysis 

In the algorithm above every vertex is put in the queue once and then accessed. For each
vertex taken from the queue its adjacent vertices are examined, and this can be done in O(n)
time (in the worst case the graph is complete). Doing this for all n vertices that
may be in the queue gives a complexity of O(n^2). Also, by aggregate analysis we can write the
complexity as O(V+E), because the inner loop executes O(E) times in total.
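A compact Python sketch of breadth-first search on an adjacency-list graph follows. The representation (a dict mapping each vertex to a list of neighbours), the function name bfs_tree and the small sample graph are assumptions made for illustration.

```python
from collections import deque


def bfs_tree(adj, s):
    """Return the edges of a BFS tree of the graph adj rooted at s,
    in the order in which they are discovered."""
    visited = {s}
    tree_edges = []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in visited:     # w is neither queued nor in the tree yet
                visited.add(w)
                tree_edges.append((v, w))
                queue.append(w)
    return tree_edges


# Hypothetical 4-vertex graph: a square a-b-d-c-a.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(bfs_tree(graph, "a"))  # [('a', 'b'), ('a', 'c'), ('b', 'd')]
```

Marking a vertex as visited when it is enqueued, rather than when it is dequeued, is what guarantees that each vertex enters the queue only once.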

Depth First Search 

This is another technique that can be used to search the graph. Choose a vertex as a root and form a
path starting at the root vertex by successively adding vertices and edges. This process is
continued until the path cannot be extended any further. If the path contains all the vertices then
the tree consisting of this path is the DFS tree. Otherwise, we must add other edges and vertices.
For this, move back from the last vertex met in the previous path and check whether it is possible
to find a new path starting from the vertex just reached. If there is such a path, continue the
process above. If this cannot be done, move back to another vertex and repeat the process. The
whole process is continued until all the vertices are met. This method of search is also called
backtracking.


Example: 

Use depth first search to find a spanning tree of the following graph. 
Solution:

Choose a as the initial vertex; then we have

Algorithm:

DFS(G,s) 

{ 

T = {s}; 

Traverse(s); 

} 

Traverse(v) 

{ 

for each w adjacent to v and not yet in T 

{ 

T = T U {w}; //put edge {v,w} also 
Traverse (w); 

} 

} 


Analysis: 

The complexity of the algorithm is dominated by the Traverse function. We can write its running
time in terms of the relation T(n) = T(n-1) + O(n); here the O(n) term arises because for each
vertex at most all the vertices are checked (the for loop), and each recursive call leaves one
vertex fewer to visit. Solving this we find that the complexity of the algorithm is O(n^2).

Also, by aggregate analysis we can write the complexity as O(V+E), because the Traverse function is
invoked at most V times and the for loop executes O(E) times in total.
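The recursive traversal can be sketched in Python as below, again assuming an adjacency-list dict; the function name dfs_tree and the sample graph are hypothetical.

```python
def dfs_tree(adj, s):
    """Return the edges of a DFS tree of the graph adj rooted at s,
    in the order in which they are added."""
    visited = {s}
    tree_edges = []

    def traverse(v):
        for w in adj[v]:
            if w not in visited:     # w is not yet in the tree
                visited.add(w)
                tree_edges.append((v, w))
                traverse(w)          # go deeper before trying other neighbours

    traverse(s)
    return tree_edges


# Hypothetical 4-vertex graph: a square a-b-d-c-a.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(dfs_tree(graph, "a"))  # [('a', 'b'), ('b', 'd'), ('d', 'c')]
```

On large graphs the recursion depth can reach the number of vertices, so an explicit stack may be preferable in practice.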


Minimum Spanning Tree 

Given an undirected graph G = (V,E), a subgraph T = (V,E') of G is a spanning tree if and only if T
is a tree. The MST is a spanning tree of a connected weighted graph such that the total sum of the
weights of all edges e ∈ E' is minimum among all the edge sets that would give a spanning
tree.

Kruskal's Algorithm:

The problem of finding an MST can be solved by using Kruskal's algorithm. The idea behind this
algorithm is to put the edges of the given graph G = (V,E) in nondecreasing order of
their weights and to consider each edge in that order, adding it to the tree unless it forms a
cycle with the edges already chosen. This guarantees that the total cost of the tree that is
formed will be the minimum. Note that we have G as a graph, V as a set of n vertices and E as the
set of edges of graph G.
[Figure: steps of Kruskal's algorithm on the example graph; panel captions include "Edge with weight 11 forms ..." and "Edge with weight 15 forms ...".]



The total weight of MST is 64. 

Algorithm: 

KruskalMST(G)
{
    T = {V}    // forest of n nodes, with no edges yet
    S = set of edges sorted in nondecreasing order of weight
    while (|T| < n-1 and E != ∅)    // |T| is the number of edges in T
    {
        Select (u,v) from S in order
        Remove (u,v) from E
        if ((u,v) does not create a cycle in T)
            T = T ∪ {(u,v)}
    }
}

Analysis: 

In the above algorithm creating the forest of n single-vertex trees at the beginning takes O(V)
time, the creation of the sorted set S takes O(E log E) time, the while loop executes at most E
times, and the cycle checks inside the loop take almost linear time in total (see disjoint set
operations; find and union). So the total time taken is O(E log E) or, asymptotically
equivalently, O(E log V).
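A Python sketch of Kruskal's algorithm with a simple disjoint-set (find and union) structure is given below. The edge format (weight, u, v), the vertex numbering 0..n-1, the function name and the sample graph are assumptions of this sketch.

```python
def kruskal_mst(n, edges):
    """n vertices numbered 0..n-1; edges is a list of (weight, u, v).
    Returns (tree edges as (u, v, weight), total weight)."""
    parent = list(range(n))          # each vertex starts as its own tree

    def find(x):
        """Find the representative of x's tree, halving the path on the way."""
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    for w, u, v in sorted(edges):    # nondecreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                 # different trees: no cycle is created
            parent[ru] = rv          # union the two trees
            tree.append((u, v, w))
            total += w
            if len(tree) == n - 1:
                break
    return tree, total


# Hypothetical graph with 4 vertices and 4 weighted edges.
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal_mst(4, edges))  # ([(0, 1, 1), (1, 2, 2), (2, 3, 4)], 7)
```

In this sample the edge of weight 3 is rejected because its endpoints are already connected, which is exactly the cycle test of the pseudocode.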


Prim’s Algorithm 

This is another algorithm for finding an MST. The idea behind this algorithm is to take any
arbitrary vertex and choose the edge of minimum weight incident on the chosen vertex. Add the
vertex at the other end of that edge and continue the above process, each time choosing the
minimum weight edge that leaves the set of vertices added so far. Remember that cycles must be
avoided.


Example: 

Find the minimum spanning tree of the following graph. 



Solution: note: the dotted edge is the one chosen at each step.

[Figure: steps of Prim's algorithm on the example graph, showing the vertex sets S and V-S at each step.]

The total weight of MST is 64.


Algorithm: 

PrimMST(G)
{
    T = ∅;      // T is the set of edges of the MST
    S = {s};    // s is a randomly chosen vertex and S is the set of vertices in the tree
    while (S != V)
    {
        e = (u,v), an edge of minimum weight with u ∈ S and v ∈ V-S
            (i.e. incident to the tree and not forming a simple circuit if added to T);
        T = T ∪ {(u,v)};
        S = S ∪ {v};
    }
}


Analysis: 

In the above algorithm the while loop executes O(V) times. The edge of minimum weight leaving the
set S can be found in O(E) time, so the total time is O(EV). We can improve the performance of the
above algorithm by choosing a better data structure such as a priority queue, with which the
running time of Prim's algorithm is O(E log V).
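The priority-queue variant mentioned in the analysis can be sketched in Python with heapq. The adjacency format (each vertex maps to a list of (neighbour, weight) pairs), the function name and the sample graph are assumptions; the graph is assumed to be connected and undirected.

```python
import heapq


def prim_mst(adj, s):
    """adj maps each vertex to a list of (neighbour, weight) pairs.
    Returns (tree edges as (u, v, weight), total weight), starting from s."""
    in_tree = {s}
    heap = [(w, s, v) for v, w in adj[s]]    # edges leaving the tree
    heapq.heapify(heap)
    tree, total = [], 0
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)        # lightest candidate edge
        if v in in_tree:                     # would form a cycle: skip it
            continue
        in_tree.add(v)
        tree.append((u, v, w))
        total += w
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return tree, total


# Hypothetical graph with vertices 0..3.
graph = {0: [(1, 1), (2, 3)], 1: [(0, 1), (2, 2)],
         2: [(0, 3), (1, 2), (3, 4)], 3: [(2, 4)]}
print(prim_mst(graph, 0))  # ([(0, 1, 1), (1, 2, 2), (2, 3, 4)], 7)
```

Stale heap entries whose endpoint has already joined the tree are simply skipped when popped; this keeps the code short at the cost of up to E entries in the heap.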
Shortest Path Problem

Given a weighted graph G = (V,E), every path $p = \langle v_0, v_1, \ldots, v_k \rangle$ has weight
$w(p) = w(v_0,v_1) + w(v_1,v_2) + \cdots + w(v_{k-1},v_k)$. A shortest path from u to v is a path
from u to v with minimum weight. The weight of a shortest path from u to v is denoted by d(u,v).
It is important to remember that a shortest path may or may not exist in a graph, i.e. if there is
a negative weight cycle on a path from u to v then there is no shortest path from u to v. For
example, the graph below has no shortest path from a to c. You can notice the negative weight
cycle on the path from a to b.



As a matter of fact a shortest path never contains a positive weight cycle either, since removing
the cycle gives a path of smaller weight; in that case, however, a shortest path still exists.
Some of the variations of the shortest path problem include:

Single Source: This type of problem asks us to find the shortest path from the given vertex 
(source) to all other vertices in a connected graph 

Single Destination: This type of problem asks us to find the shortest path to the given vertex 
(destination) from all other vertices in a connected graph. 

Single Pair: This type of problem asks us to find the shortest path from the given vertex (source) 
to another given vertex (destination). 

All Pairs: This type of problem asks us to find the shortest path from the all vertices to all other 
vertices in a connected graph 

Single Source Problem 

Relaxation: Relaxation of an edge (u,v) is the process of testing whether the best known path to v
can be improved by going through u, i.e. whether d[u] + w(u,v) is less than the current d[v], and
if so replacing the record of the previous shortest path to v by the new one.

Directed Acyclic Graphs (Single Source Shortest paths) 

Recall the definition of DAG, DAG is a directed graph G = (V,E) without a cycle. The algorithm 
that finds the shortest paths in a DAG starts by topologically sorting the DAG for getting the linear 
ordering of the vertices. The next step is to relax the edges as usual. 

Example: 

Find the shortest path from the vertex c to all other vertices in the following DAG. 
[Figure: relaxation steps on the example DAG; initially the source vertex has distance 0 and every other vertex has distance ∞.]

From (h), (d) and (g) there is no change. So the above is the shortest path tree.
Algorithm:

DagSP(G, w, s)
{
    Topologically sort the vertices of G
    for each vertex v belonging to V
        do d[v] = ∞
    d[s] = 0
    for each vertex u, taken in topologically sorted order
        do for each vertex v adjacent to u
            do if d[v] > d[u] + w(u,v)
                then d[v] = d[u] + w(u,v)
}
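A Python sketch of the same procedure is shown below. It assumes the DAG is given as a dict mapping each vertex to a list of (successor, weight) pairs, obtains the topological order from a depth-first search, and uses hypothetical names and a hypothetical sample graph.

```python
import math


def dag_shortest_paths(adj, s):
    """Single-source shortest paths in a DAG.
    adj maps each vertex to a list of (successor, weight) pairs."""
    # Topological sort: reverse of the DFS finishing order.
    order, seen = [], set()

    def visit(u):
        seen.add(u)
        for v, _ in adj[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in adj:
        if u not in seen:
            visit(u)
    order.reverse()

    d = {u: math.inf for u in adj}
    d[s] = 0
    for u in order:                       # vertices in topological order
        for v, w in adj[u]:
            if d[v] > d[u] + w:           # relax edge (u, v)
                d[v] = d[u] + w
    return d


# Hypothetical DAG; negative edge weights are allowed.
dag = {"a": [("b", 2), ("c", 6)], "b": [("c", 3), ("d", -1)],
       "c": [("d", 1)], "d": []}
print(dag_shortest_paths(dag, "a"))  # {'a': 0, 'b': 2, 'c': 5, 'd': 1}
```

Vertices that cannot be reached from the source keep the distance infinity. Since every edge is relaxed exactly once after the topological sort, the running time of this sketch is O(V+E).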


Dijkstra’s Algorithm 

This is another approach of getting single source shortest paths. In this algorithm it is assumed that 
there is no negative weight edge. Dijkstra’s algorithm works using greedy approach, as we will see 
later. 

Example: 

Find the shortest paths from the source g to all other vertices using Dijkstra’s algorithm. 
There will be no change for vertices b and d. Continue the above steps for b and d to
complete. The tree is shown by the dark connections.


Algorithm: 

Dijkstra(G, w, s)
{
    for each vertex v ∈ V
        do d[v] = ∞
    d[s] = 0
    S = ∅
    Q = V
    while (Q != ∅)
    {
        u = Take minimum from Q and delete.
        S = S ∪ {u}
        for each vertex v adjacent to u
            do if d[v] > d[u] + w(u,v)    // relax edge (u,v)
                then d[v] = d[u] + w(u,v)
    }
}



Analysis: 

In the above algorithm, the first for loop block takes O(V) time. Initialization of the priority
queue Q takes O(V) time. The while loop executes O(V) times, where for each execution the block
inside the loop takes O(V) time. Hence the total running time is O(V^2).
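Below is a Python sketch of Dijkstra's algorithm. Note that it swaps in a binary heap (the standard heapq module) for the priority queue, so its running time is O((V+E) log V) rather than the O(V^2) bound derived above for a simple array-based queue. The adjacency format, function name and sample graph are assumptions; all edge weights must be non-negative.

```python
import heapq
import math


def dijkstra(adj, s):
    """adj maps each vertex to a list of (neighbour, weight) pairs,
    with non-negative weights. Returns the dict of distances from s."""
    d = {u: math.inf for u in adj}
    d[s] = 0
    done = set()                          # the set S of finished vertices
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)       # vertex with minimum tentative distance
        if u in done:
            continue                      # stale entry: u was finished earlier
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:             # relax edge (u, v)
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d


# Hypothetical directed graph with source g.
graph = {"g": [("a", 4), ("b", 1)], "b": [("a", 2), ("c", 5)],
         "a": [("c", 1)], "c": []}
print(dijkstra(graph, "g"))  # {'g': 0, 'b': 1, 'a': 3, 'c': 4}
```

Instead of decreasing a key inside the queue, the sketch pushes a new entry and ignores outdated ones when they are popped, which is the usual idiom with heapq.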

All Pairs Problem 

As defined in the above sections, we can apply a single source shortest path algorithm |V| times to
solve the all pairs shortest paths problem.

Floyd-Warshall Algorithm

The algorithm being discussed uses the dynamic programming approach. The algorithm presented
here works even if some of the edges have negative weights, provided there is no negative weight
cycle. Consider a weighted graph G = (V,E) and denote the weight of the edge connecting vertices i
and j by $w_{ij}$. Let W be the adjacency matrix of the given graph G. Let $D^k$ denote an n x n
matrix such that $D^k(i,j)$ is defined as the weight of the shortest path from vertex i to vertex j
using only vertices from {1, 2, ..., k} as intermediate vertices in the path. For such a shortest
path there are two cases: either it does not contain k as an intermediate vertex, or it does.
Then we have the following relations:

$D^k(i,j) = D^{k-1}(i,j)$, when k is not an intermediate vertex, and

$D^k(i,j) = D^{k-1}(i,k) + D^{k-1}(k,j)$, when k is an intermediate vertex.

So from the above relations we obtain:

$$
D^k(i,j) = \min\{ D^{k-1}(i,j),\; D^{k-1}(i,k) + D^{k-1}(k,j) \}
$$

The above relation is used by Floyd's algorithm to compute all pairs shortest paths in a
bottom up manner, finding $D^1, D^2, \ldots, D^n$ starting from $D^0 = W$.
Example:



Solution:

Adjacency Matrix

  W  |  1    2    3
 ----+---------------
  1  |  0    4   11
  2  |  6    0    2
  3  |  3    ∞    0

 D^1 |  1    2    3
 ----+---------------
  1  |  0    4   11
  2  |  6    0    2
  3  |  3    7    0

 D^2 |  1    2    3
 ----+---------------
  1  |  0    4    6
  2  |  6    0    2
  3  |  3    7    0

 D^3 |  1    2    3
 ----+---------------
  1  |  0    4    6
  2  |  5    0    2
  3  |  3    7    0

Remember we are not showing D^k(i,i), since there will be no change, i.e. the shortest path is
zero.

D^1(1,2) = min{D^0(1,2), D^0(1,1) + D^0(1,2)} = min{4, 0 + 4} = 4
D^1(1,3) = min{D^0(1,3), D^0(1,1) + D^0(1,3)} = min{11, 0 + 11} = 11
D^1(2,1) = min{D^0(2,1), D^0(2,1) + D^0(1,1)} = min{6, 6 + 0} = 6
D^1(2,3) = min{D^0(2,3), D^0(2,1) + D^0(1,3)} = min{2, 6 + 11} = 2
D^1(3,1) = min{D^0(3,1), D^0(3,1) + D^0(1,1)} = min{3, 3 + 0} = 3
D^1(3,2) = min{D^0(3,2), D^0(3,1) + D^0(1,2)} = min{∞, 3 + 4} = 7

D^2(1,2) = min{D^1(1,2), D^1(1,2) + D^1(2,2)} = min{4, 4 + 0} = 4
D^2(1,3) = min{D^1(1,3), D^1(1,2) + D^1(2,3)} = min{11, 4 + 2} = 6
D^2(2,1) = min{D^1(2,1), D^1(2,2) + D^1(2,1)} = min{6, 0 + 6} = 6
D^2(2,3) = min{D^1(2,3), D^1(2,2) + D^1(2,3)} = min{2, 0 + 2} = 2
D^2(3,1) = min{D^1(3,1), D^1(3,2) + D^1(2,1)} = min{3, 7 + 6} = 3
D^2(3,2) = min{D^1(3,2), D^1(3,2) + D^1(2,2)} = min{7, 7 + 0} = 7

D^3(1,2) = min{D^2(1,2), D^2(1,3) + D^2(3,2)} = min{4, 6 + 7} = 4
D^3(1,3) = min{D^2(1,3), D^2(1,3) + D^2(3,3)} = min{6, 6 + 0} = 6
D^3(2,1) = min{D^2(2,1), D^2(2,3) + D^2(3,1)} = min{6, 2 + 3} = 5
D^3(2,3) = min{D^2(2,3), D^2(2,3) + D^2(3,3)} = min{2, 2 + 0} = 2
D^3(3,1) = min{D^2(3,1), D^2(3,3) + D^2(3,1)} = min{3, 0 + 3} = 3
D^3(3,2) = min{D^2(3,2), D^2(3,3) + D^2(3,2)} = min{7, 0 + 7} = 7
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A IgorM lim: 

FIoydWarshalA PSPfV/, D, ni // W is adjacency matrix of graph G. 

/ 

for(i= l ;ic =n 

for(j= I „ 7 <= / ;j+ +) 

D[iJ[jJ = Wf il{}}; *f initially Dfjfj is DP. 

For( k= I ; Jt<= n;k + +) 
for( i= I ; i<;=n; i+ +J 
for(j= I ;j< =1 ;j++) 

DfiJfjJ = minlDlijUl D[i}[k}+ D[k][jj}; //D[}U’ S are & s. 

/ 

Analysis: 

Clearly the above algorithm's running time is $O(n^3)$, since it consists of three nested loops that each run n times, where n is the cardinality of the set V of vertices.
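The recurrence and the worked example above can be checked with a short sketch. This is a minimal Python version under stated assumptions: vertices are renumbered from 0, a missing edge is represented by `math.inf`, and the names `floyd_warshall`, `W` and `D` are chosen here for illustration only.

```python
import math

def floyd_warshall(w):
    """Return the all-pairs shortest-path matrix for adjacency matrix w."""
    n = len(w)
    d = [row[:] for row in w]  # D^0 is a copy of W
    for k in range(n):          # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                # D^k(i,j) = min{D^(k-1)(i,j), D^(k-1)(i,k) + D^(k-1)(k,j)}
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = math.inf
# Adjacency matrix W from the example (vertices 1..3 become indices 0..2).
W = [[0, 4, 11],
     [6, 0, 2],
     [3, INF, 0]]
D = floyd_warshall(W)
```

Updating the matrix in place is safe here because row k and column k do not change during iteration k (the diagonal entries are zero), so a single matrix can stand in for the whole sequence of matrices.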

Exercises 


1. Write an algorithm for topological sorting of a directed graph.

2. Explore the applications of DFS and BFS. Describe the biconnected component and an algorithm for its detection in a graph.

3. Give an example graph where the Bellman-Ford algorithm returns FALSE. Justify the falsity of the return value.

4. Give an example graph where Dijkstra's algorithm fails to work. Why does the algorithm not work on that graph? Give reasons.

5. A maximum spanning tree of a weighted graph G = (V,E) is a subgraph T = (V,E*) of G such that T is a tree and the sum of the weights of all the edges of E* is maximum among all possible sets of edges that would form a spanning tree. Modify Prim's algorithm to solve for the maximum spanning tree.

Chapter 2:


[Geometric Algorithms] 


Computational Geometry 

The field of computational geometry deals with the study of geometric problems. In our class we present a few geometric problems, e.g. detecting the intersection between line segments, and try to solve them using known algorithms. In this lecture, we discuss and present algorithms in the context of 2-D.

Application Domains 


o Computer graphics 
o Robotics 
o GIS 

o CAD/CAM - IC design, automobiles, buildings
o Molecular Modeling 
o Pattern recognition 


Some Definitions 

Point:

A point is a pair of numbers. The numbers are real numbers, but in our usual calculations we concentrate on integers. For e.g. p1(x1, y1) and p2(x2, y2) are two points.

Line segment:

A line segment is a pair of points p1 and p2, where the two points are the end points of the segment. For e.g. S(p1, p2) is the segment with end points p1 and p2.

[Figure: points p1(x1, y1) and p2(x2, y2) joined by the segment S(p1, p2)]


Ray:

A ray is an infinite one-dimensional subset of a line determined by two points, say P0 and P1, where one point is designated as the endpoint. Thus, a ray starts at a bounding point and extends infinitely along the line through the two points.

[Figure: ray starting at the endpoint P0 and passing through P1, a point in the ray's direction]

Line:

A line is represented by a pair of points, say P0 and P1, and extends to infinity in both directions along the segment represented by the pair of points P0 and P1.

[Figure: line through P0 and P1]


Polygon:

Simply put, a polygon is a homeomorphic image of a circle, i.e. it is a certain deformation of a circle.

Simple Polygon:

A simple polygon is a region of the plane bounded by a finite collection of line segments that form a simple closed curve. Mathematically, let V0, V1, V2, ..., Vn-1 be n ordered vertices in the plane; then the line segments e0 = (V0, V1), e1 = (V1, V2), ..., en-1 = (Vn-1, V0) form a simple polygon if and only if:

• the intersection of each pair of segments adjacent in the cyclic ordering is the single point shared by them: ei ∩ ei+1 = Vi+1, and

• non-adjacent segments do not intersect: ei ∩ ej = ∅.

Thus, a polygon is simple if there are no intersection points between non-consecutive line segments, i.e. the vertices are the only intersection points. The vertices of a simple polygon are assumed to be ordered in the counterclockwise direction.

Non-Simple Polygon (Self-Intersecting):

A polygon is non-simple if there is no single interior region, i.e. non-adjacent edges intersect each other.

Convex Polygon:

A simple polygon P is convex if and only if for any pair of points x, y in P the line segment between x and y lies entirely in P. We can notice that if all the interior angles are less than 180°, then the simple polygon is a convex polygon.

[Figure: a convex polygon and a non-convex (concave) polygon]

Diagonal of a simple polygon:

A diagonal of a simple polygon is a line segment that connects two non-adjacent vertices and lies completely inside the polygon.

[Figure: polygon with diagonals]

Here (V2, V5), (V3, V7), (V4, V6) & (V10, V12) are all diagonals of the polygon, but (V9, V11) is not a diagonal.

Ear of Polygon:

Three consecutive vertices Vi, Vi+1, Vi+2 of a polygon form an ear if (Vi, Vi+2) is a diagonal; Vi+1 is the tip of the ear.

[Figure: polygon with ears and mouths]

(V1, V2, V3) is an ear.

But (V0, V1, V2) is not an ear.
(V-, V0, V1) & (V1, V2, V3) are non-overlapping ears.
(V1, V2, V3) & (V3, V4, V5) are non-overlapping ears.
(V3, V4, V5) & (V4, V5, V6) are overlapping ears.

Mouth:

Three consecutive vertices Vi, Vi+1, Vi+2 of a polygon form a mouth if (Vi, Vi+2) is an external diagonal. In the above figure, (V0, V1, V2) & (V2, V3, V4) are mouths of the polygon.

One-Mouth Theorem

Except for convex polygons, every simple polygon has at least one mouth.

Two-Ears Theorem

Every simple polygon of n ≥ 4 vertices has at least two non-overlapping ears.

Notion of Left Turn & Right Turn:

Left Turn:

For three points P0, P1, P2 in a plane, P0, P1, P2 is said to be a left turn if the line segment (P1, P2) lies to the left of the line segment (P0, P1).

[Figure: left turn at P1 along P0(x0, y0), P1, P2]

Right Turn:

If the line segment (P1, P2) lies to the right of (P0, P1), then P0, P1, P2 is a right turn.

[Figure: right turn at P1 along P0, P1, P2]

Computing point of intersection between two line segments

We can apply the coordinate geometry method for finding the point of intersection between two line segments. Let S1 and S2 be any two line segments. The following steps are used to calculate the point of intersection between them. We are not considering parallel line segments in this discussion.

• Determine the equations of the lines through the line segments S1 and S2. Say the equations are L1: y = m1 x + c1 and L2: y = m2 x + c2 respectively. We can find the slope of L1 using the formula m1 = (y2 − y1)/(x2 − x1), where (x1, y1) and (x2, y2) are the two given end points of the line segment S1. Similarly we can find m2 for L2. The values of the ci's can be obtained by substituting an end point of the respective line segment into the equation after getting the slope of that line.

• Solve the two equations of lines L1 and L2, and let the value obtained by solving be p = (xi, yi). Here we confront two cases. In the first case, p is the intersection of the two line segments, so p lies on both S1 and S2. In the second case, p is not an intersection point of the segments, so p does not lie on at least one of the line segments S1 and S2.
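The two steps above can be sketched in Python. This is only an illustration under the same restriction as the text (no parallel lines) plus one more assumption made here: neither segment is vertical, since the slope formula divides by x2 − x1. The function names and the tolerance `eps` are hypothetical choices, not part of the notes.

```python
def line_through(p, q):
    """Slope m and intercept c of the line y = m*x + c through p and q.

    Assumes the line is not vertical (p and q have different x values).
    """
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    c = y1 - m * x1          # substitute the end point p into y = m*x + c
    return m, c

def within(a, b, t, eps=1e-9):
    """True if t lies between a and b (in either order), up to eps."""
    return min(a, b) - eps <= t <= max(a, b) + eps

def segment_intersection(s1, s2):
    """Intersection point of segments s1 and s2, or None if they miss.

    Assumes the two supporting lines are not parallel (m1 != m2).
    """
    m1, c1 = line_through(*s1)
    m2, c2 = line_through(*s2)
    x = (c2 - c1) / (m1 - m2)   # solve m1*x + c1 = m2*x + c2
    y = m1 * x + c1
    for p, q in (s1, s2):
        # p = (x, y) is on both lines; it is on a segment only if it
        # also falls inside that segment's coordinate ranges.
        if not (within(p[0], q[0], x) and within(p[1], q[1], y)):
            return None
    return (x, y)

crossing = segment_intersection(((0, 0), (2, 2)), ((0, 2), (2, 0)))
missing = segment_intersection(((0, 0), (1, 1)), ((0, 3), (3, 0)))
```

In the second call the supporting lines meet at (1.5, 1.5), which lies beyond the end of the first segment, so the segments themselves do not intersect.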

The figure below shows both the cases. 




Segments do not intersect 


Segments intersect 


Detecting point of intersection 

In a straightforward manner we can compute the point of intersection (p) between the lines passing through S1 and S2 and see whether the line segments intersect or not, as done in the above discussion. However, that method uses division in the computation, and we know that division is a costly operation. Here we try to detect the intersection without using division.

Left and Right Turn: Given points p0(x0, y0), p1(x1, y1), and p2(x2, y2), to find whether the path p0p1p2 makes a left or a right turn, we check whether the vector p0p1 is clockwise or counterclockwise with respect to the vector p0p2. We compute the cross product of the vectors given by the two line segments as

$$(p_1 - p_0) \times (p_2 - p_0) = (x_1 - x_0,\ y_1 - y_0) \times (x_2 - x_0,\ y_2 - y_0) = (x_1 - x_0)(y_2 - y_0) - (y_1 - y_0)(x_2 - x_0),$$

this can be represented as

$$\Delta = \begin{vmatrix} 1 & 1 & 1 \\ x_0 & x_1 & x_2 \\ y_0 & y_1 & y_2 \end{vmatrix}.$$

Here we have:

• If Δ = 0 then p0, p1, p2 are collinear.

• If Δ > 0 then p0p1p2 make a left turn, i.e. there is a left turn at p1 (p0p1 is clockwise with respect to p0p2).

• If Δ < 0 then p0p1p2 make a right turn, i.e. there is a right turn at p1 (p0p1 is anticlockwise with respect to p0p2).

See the figures below for the idea of left and right turns as well as the direction of points. The geometric interpretation of the cross product is also shown below.

Fig: Cross product geometrical interpretation. The area of the parallelogram spanned by two vectors is given by their cross product.

Fig: Right turn at p1 (p0p1 is anticlockwise with respect to p0p2)

Fig: Left turn at p1 (p0p1 is clockwise with respect to p0p2)


Using the concept of left and right turns, we can detect the intersection between two line segments in a very efficient manner.
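The notes do not spell the test out, so the following Python sketch shows one standard way to do it with turn tests only (multiplications and subtractions, no division): two segments cross when the end points of each lie on opposite sides of the other, with collinear touching handled separately. The function names are illustrative assumptions.

```python
def cross(p0, p1, p2):
    """(p1 - p0) x (p2 - p0): > 0 left turn, < 0 right turn, 0 collinear."""
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])

def on_segment(p, q, r):
    """Given r collinear with p and q, is r inside the box spanned by pq?"""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))

def segments_intersect(p1, p2, p3, p4):
    """True if segment p1p2 and segment p3p4 share at least one point."""
    d1 = cross(p3, p4, p1)   # side of p1 relative to p3p4
    d2 = cross(p3, p4, p2)   # side of p2 relative to p3p4
    d3 = cross(p1, p2, p3)   # side of p3 relative to p1p2
    d4 = cross(p1, p2, p4)   # side of p4 relative to p1p2
    # Proper crossing: each segment straddles the line of the other.
    if ((d1 > 0 > d2) or (d1 < 0 < d2)) and ((d3 > 0 > d4) or (d3 < 0 < d4)):
        return True
    # Boundary cases: an end point lies on the other segment.
    if d1 == 0 and on_segment(p3, p4, p1):
        return True
    if d2 == 0 and on_segment(p3, p4, p2):
        return True
    if d3 == 0 and on_segment(p1, p2, p3):
        return True
    if d4 == 0 and on_segment(p1, p2, p4):
        return True
    return False
```

With integer coordinates every quantity stays an integer, so the test is exact, which is a second advantage over the slope-based method besides avoiding division.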

Convex hull: 

Definition: 

- The convex hull of a finite set of points S in the plane is the smallest convex polygon P that encloses S (smallest in area).

- The convex hull of a set of points S in the plane is the union of all the triangles determined by points in S.

- The convex hull of a finite set of points S is the intersection of all the convex polygons (sets) that contain S.

Convex hulls are used in a wide range of application areas, such as:

- In pattern recognition, an unknown shape may be represented by its convex hull, which is then matched to a database of known shapes.

- In motion planning, if the robot is approximated by its convex hull, then it is easier to plan a collision-free path through the landscape of obstacles.

- Smallest box, fitting ranges & so on.


Graham’s Scan Algorithm: 

This algorithm computes the convex hull of a set of points by maintaining the feasible candidate points on a stack. If a candidate point is not extreme, then it is removed from the stack. When all points have been examined, only the extreme points remain on the stack, and these form the final hull.

Input => P = {p0, p1, ..., pn-1} of n points.

Output => Convex hull of P

Algorithm:

- Find a point pi with the lowest y-coordinate; let it be q0.

- Sort the input points angularly about q0; let the sorted list be {q0, q1, ..., qn-1}.

- Push q0 onto stack S and push q1 onto stack S.

- Initialize i = 2

    while (i < n)
        if LeftTurn(NextToTop(S), Top(S), qi) is true
            push qi onto stack S
            i++
        else
            pop the stack
        end if
    end while

- Each point popped from the stack is not a vertex of the convex hull.

- Finally, when all elements are processed, the points that remain on the stack are the vertices of the convex hull.

Complexity Analysis: 

- Finding the point with the minimum y-coordinate takes O(n) time.

- Sorting angularly about that point takes O(n log n) time.

- Each push and pop takes constant time.

- The while loop runs O(n) times, since each point is pushed once and popped at most once.

Hence, the complexity = O(n) + O(n log n) + O(1) + O(n)

= O(n log n).

Another way of constructing this algorithm:

GrahamScan(P) // P = {p1, p2, ..., pn}
{
    p0 = point with lowest y-coordinate value.
    Angularly sort the other points with respect to p0. Let q = {q1, q2, ..., qm} be the sorted points.
    Push(S, p0); // S is a stack
    Push(S, q1);
    Push(S, q2);
    for (i = 3; i <= m; i++)
    {
        // re-read the top two stack entries after every pop
        while (NextToTop(S), Top(S), qi make a non-left turn)
            Pop(S);
        Push(S, qi);
    }
    return S;
}

Consider an example:

[Figure: example point set q0, q1, ..., sorted angularly about q0]

(1) At first, push q0 & q1 onto the stack.

    Stack (top first): q1, q0

(2) Scan q2:

    q0q1q2 is a left turn, so push q2 onto the stack.

    Stack (top first): q2, q1, q0

(3) Scan q3:

    q1q2q3 is not a left turn, so pop the stack, i.e. pop q2.

    Stack (top first): q1, q0

    q0q1q3 is a left turn, so push q3 onto the stack.

    Stack (top first): q3, q1, q0

& so on.

At last, final hull will be => {qo- qi. q 3 . q.»- q<s} 
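The scan described above can be sketched in Python as follows. This is a minimal illustration rather than the notes' exact procedure: the angular sort is done here with a cross-product comparator instead of computing angles (an implementation choice made for exactness), ties on the same ray are broken by distance, and collinear points are dropped from the hull. The names `graham_scan`, `by_angle` and the sample `points` are assumptions of this sketch.

```python
from functools import cmp_to_key

def cross(p0, p1, p2):
    """(p1 - p0) x (p2 - p0): positive means a left turn at p1."""
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])

def graham_scan(points):
    """Return the convex hull vertices in counterclockwise order."""
    pts = list(set(points))                 # ignore duplicate points
    if len(pts) < 3:
        return sorted(pts)
    # Lowest point; the leftmost one is taken when several share that y.
    p0 = min(pts, key=lambda p: (p[1], p[0]))

    def by_angle(a, b):
        c = cross(p0, a, b)
        if c != 0:
            return -1 if c > 0 else 1       # a comes first if b is to its left
        da = (a[0] - p0[0]) ** 2 + (a[1] - p0[1]) ** 2
        db = (b[0] - p0[0]) ** 2 + (b[1] - p0[1]) ** 2
        return (da > db) - (da < db)        # same ray: nearer point first

    rest = sorted((p for p in pts if p != p0), key=cmp_to_key(by_angle))
    stack = [p0]
    for q in rest:
        # Pop while (next-to-top, top, q) does not make a left turn.
        while len(stack) >= 2 and cross(stack[-2], stack[-1], q) <= 0:
            stack.pop()
        stack.append(q)
    return stack

points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0), (0, 1)]
hull = graham_scan(points)
```

For the sample square with one interior point and two points on its edges, only the four corners survive on the stack.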


Exercises:

1. Write down the algorithm to find the point of intersection of two line segments, if it exists.

2. Use Graham's scan algorithm to find the convex hull of the set of points below.

[Figure: set of labeled points in the plane]

Example 2:

Find the convex hull of the set of points given below using Graham's scan algorithm.

[Figure: the given set of points]

Solution:

[Figure: the points relabeled q0, q1, ... for the scan]

Chapter 3

[NP Complete Problems & Approximation Algorithms]

Up to now we have been considering problems that can be solved by algorithms in worst-case polynomial time. There are many problems, and not all of them have an apparent solution. This idea carries over to solving problems using computers. A computer can solve some problems in limited time, e.g. sorting; some problems require an unmanageable amount of time, e.g. Hamiltonian cycles; and some problems cannot be solved at all, e.g. the Halting Problem. In this section we concentrate on the specific class of problems called NP-complete problems (defined later).

Tractable and Intractable Problems: 

We call a problem tractable or easy if it can be solved using a polynomial time algorithm. Problems that cannot be solved in polynomial time but require superpolynomial time algorithms are called intractable or hard problems. There are many problems for which no algorithm with running time better than exponential time is known; some of them are the traveling salesman problem, Hamiltonian cycles, and circuit satisfiability.

P and NP classes and NP completeness:

The set of problems that can be solved using a polynomial time algorithm is regarded as class P. The problems that are verifiable in polynomial time constitute the class NP. The class of NP-complete problems consists of those problems that are in NP and are as hard as any problem in NP (more on this later). The main concern of studying NP-completeness is to understand how hard a problem is. So if we can show that some problem is NP-complete, then we try to solve the problem using methods like approximation, rather than searching for a faster algorithm that solves the problem exactly.


Problems: 

Abstract Problems: 

An abstract problem A is a binary relation on the set I of problem instances and the set S of problem solutions. For e.g. the minimum spanning tree problem can be viewed as pairs consisting of a given graph G and an MST T of that graph.


Decision Problems:

A decision problem D is a problem whose answer is either “true”, “yes”, “1” or “false”, “no”, “0”. For e.g. if we take the abstract shortest path problem and restrict the solution set to {0,1}, then we can transform that abstract problem into a decision problem by reformulating it as “Is there a path from u to v with at most k edges?”. In this situation the answer is either yes or no.


Optimization Problems:

We encounter many problems where there are many feasible solutions and our aim is to find the feasible solution with the best value. This kind of problem is called an optimization problem. For e.g. given the graph G and the vertices u and v, find the path from u to v with the minimum number of edges. NP-completeness does not deal directly with optimization problems; however, we can translate an optimization problem into a decision problem.
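To make the translation concrete, the sketch below pairs the decision question from the text ("is there a path from u to v with at most k edges?") with the optimization question, answering the latter by asking the former for k = 0, 1, 2, .... It is only an illustration: the graph is assumed to be an unweighted adjacency dictionary that lists every vertex as a key, and both function names are hypothetical.

```python
from collections import deque

def path_within_k_edges(graph, u, v, k):
    """Decision problem: is there a path from u to v using at most k edges?"""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        if dist[x] == k:          # do not walk further than k edges
            continue
        for y in graph.get(x, []):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return False

def fewest_edges(graph, u, v):
    """Optimization problem solved by repeated calls to the decision problem."""
    # A simple path uses at most n - 1 edges, so larger k never helps.
    for k in range(len(graph)):
        if path_within_k_edges(graph, u, v, k):
            return k
    return None                   # v is not reachable from u

graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
```

The point of the sketch is the direction of the argument: if the decision version could be answered quickly, the optimization version could be answered with only a polynomial number of extra calls, so the decision version is no harder than the optimization version.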
Encoding:

An encoding of a set S is a function e from S to the set of binary strings. With the help of an encoding, we define a concrete problem as a problem whose instances are binary strings, i.e. if we encode an abstract problem, then the resulting encoded problem is a concrete problem. So, encoding assures that every encoded (concrete) problem can be regarded as a language, i.e. a subset of {0,1}*.

Complexity Class P: 

Complexity class P is the set of concrete decision problems that are solvable in polynomial time by a deterministic algorithm. If we have an abstract decision problem A with instance set I mapping to the set {0,1}, an encoding e: I → {0,1}* is used to obtain the concrete decision problem e(A). The solution to both the abstract problem instance i ∈ I and the concrete problem instance e(i) ∈ {0,1}* is A(i) ∈ {0,1}. It is important to understand that the encoding can greatly change the running time of an algorithm when it is measured in terms of the input size. For e.g. take some algorithm whose only input is a natural number k and which runs in O(k) time. With unary encoding the input is a string of k 1's, so the input size is n = k and the running time is O(n), linear in the input size. With binary encoding we can represent the number k with just about log k bits (try to represent it with 0 and 1 only), so the input size is n = log k and the same O(k) running time becomes O(2^n), exponential in the input size. However, in our discussion we discard expensive encodings like unary, so that the choice of encoding makes no great difference in complexity.
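The gap between the two encodings is easy to see numerically. The following Python lines are a small illustration with hypothetical helper names `unary` and `binary`; the value k = 1000 is chosen arbitrarily.

```python
def unary(k):
    """Unary encoding: the natural number k written as k ones."""
    return "1" * k

def binary(k):
    """Binary encoding: k written with the digits 0 and 1 only."""
    return bin(k)[2:]      # strip Python's "0b" prefix

k = 1000
unary_size = len(unary(k))     # input size n under unary encoding
binary_size = len(binary(k))   # input size n under binary encoding
```

An algorithm that performs about k steps therefore does roughly n steps relative to the unary input size, but roughly 2^n steps relative to the binary input size.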

We call a function f: {0,1}* → {0,1}* polynomial time computable if there is some polynomial time algorithm PA that, given any input x ∈ {0,1}*, produces f(x) as output. For some set I of problem instances, two encodings e1 and e2 are polynomially related if there are two polynomial time computable functions f and g such that for any i ∈ I, both f(e1(i)) = e2(i) and g(e2(i)) = e1(i) hold, i.e. each encoding can be computed from the other in polynomial time by some algorithm.

Polynomial time reduction:

Given two decision problems A and B, a polynomial time reduction from A to B is a polynomial time function f that transforms the instances of A into instances of B such that the output of an algorithm for problem A on input instance x is the same as the output of an algorithm for problem B on input instance f(x), as shown in the figure below. If there is a polynomial time computable function f by which it is possible to reduce A to B, this is denoted as A ≤p B. The function f described above is called the reduction function and the algorithm for computing f is called the reduction algorithm.

Complexity Class NP:

NP is the set of decision problems solvable by nondeterministic algorithms in polynomial time. When we have a problem, it is generally much easier to verify that a given value is a solution to the problem than to calculate the solution. Using this idea, we say a problem is in class NP (nondeterministic polynomial time) if there is an algorithm that verifies the problem in polynomial time. V is a verification algorithm for the decision problem D if V takes as input a string x, an instance of the problem D, and another binary string y, the certificate, whose size is no more than polynomial in the size of x. The algorithm V verifies an input x if there is a certificate y such that V, run on x together with y, answers yes. For e.g. the circuit satisfiability problem (SAT) is the question “Given a Boolean combinational circuit, is it satisfiable? i.e. does the circuit have an assignment of truth values that produces the output of the circuit as 1?” Given a circuit x and a certificate y consisting of the set of values that produce output 1, we can verify in polynomial time whether the given certificate satisfies the circuit. So we can say that the circuit satisfiability problem is in NP. We can always say P ⊆ NP, since if we have a problem for which a polynomial time algorithm exists to solve (decide: notice the difference between decide and accept) the problem, then we can always get a verification algorithm that neglects the certificate and accepts the output of the polynomial time algorithm. From the above fact it is clear that P ⊆ NP, but the question whether P = NP remains unsolved and is still the big question in theoretical computer science. Most computer scientists, however, believe that P ≠ NP.
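The verification idea for circuit satisfiability can be sketched as follows. The circuit representation is an assumption made for this illustration only: input wires are numbered 0 to num_inputs − 1, each gate is a pair (operation, list of earlier wire numbers), every gate adds one new wire, and the last wire is the circuit output. The function name `verify_circuit` is hypothetical.

```python
def verify_circuit(gates, num_inputs, certificate):
    """Verifier V(x, y): x is the circuit, y is the certificate (input bits).

    Returns True exactly when the certificate makes the circuit output 1.
    Runs in time linear in the size of the circuit.
    """
    if len(certificate) != num_inputs:
        return False
    wires = [bool(bit) for bit in certificate]
    for op, args in gates:
        values = [wires[a] for a in args]
        if op == "AND":
            wires.append(all(values))
        elif op == "OR":
            wires.append(any(values))
        elif op == "NOT":
            wires.append(not values[0])
        else:
            raise ValueError("unknown gate: " + op)
    return wires[-1]

# Circuit for (x0 AND x1) OR (NOT x2): wires 3, 4, 5 are the gate outputs.
circuit = [("AND", [0, 1]), ("NOT", [2]), ("OR", [3, 4])]
```

Checking one certificate is cheap; what is not known to be possible in polynomial time is finding a satisfying certificate among the 2^n candidate assignments, which is the difference between verifying and deciding.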


NP-Completeness:

NP-complete problems are the hardest problems in class NP. We say that a problem A is NP-complete if

1. A ∈ NP, and

2. B ≤p A, for every B ∈ NP.

A problem (or language) A satisfying property 2 is called NP-hard.

Cook's Theorem:

“SAT is NP-hard”

Proof: (This is not the actual proof as given by Cook; this is just a sketch.)

Take a problem V ∈ NP, and let A be the algorithm that verifies V in polynomial time (such an algorithm must exist since V ∈ NP). We can program A on a computer, and therefore there exists a (huge) logical circuit whose input wires correspond to the bits of the inputs x and y of A and which outputs 1 precisely when A(x,y) returns yes.

For any instance x of V, let Ax be the circuit obtained from A by setting the x-input wire values according to the specific string x. The construction of Ax from x is our reduction function. If x is a yes instance of V, then the certificate y for x gives a satisfying assignment for Ax. Conversely, if Ax outputs 1 for some assignment to its input wires, that assignment translates into a certificate for x.

Theorem 2: (Cook's Theorem)

“SAT is NP-complete”

Proof:

To show that SAT is NP-complete we have to show the two properties given by the definition of NP-complete problems. The first property, i.e. SAT is in NP, was shown above (see the circuit satisfiability example under Complexity Class NP), so it is sufficient to show that the second property holds for SAT. The second property, i.e. SAT is NP-hard, follows from the result stated above. This completes the proof.

Approximation Algorithms: 

An approximation algorithm is a way of dealing with NP-completeness for optimization problems. This technique does not guarantee the best solution. The goal of an approximation algorithm is to come as close as possible to the optimum value in a reasonable amount of time, which is at most polynomial time. If we are dealing with an optimization problem (maximization or minimization) whose feasible solutions have positive cost, then it is worthwhile to look at an approximation algorithm for a near-optimal solution.

An algorithm has an approximation ratio of ρ(n) if, for any input of size n, the cost C of the solution produced by the algorithm and the cost C* of an optimal solution satisfy max(C/C*, C*/C) ≤ ρ(n). Such an algorithm is called a ρ(n)-approximation algorithm.

The relation applies to both maximization (0 < C ≤ C*) and minimization (0 < C* ≤ C) problems. ρ(n) is always greater than or equal to 1. If the solution produced by the approximation algorithm is a true optimal solution, then clearly we have ρ(n) = 1.
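As a quick numerical check of the definition, the ratio can be computed directly. The function name and the sample costs below are made up for illustration; both costs are assumed to be positive, as the definition requires.

```python
def approximation_ratio(cost, optimal_cost):
    """max(C/C*, C*/C) for positive costs C and C*; always at least 1."""
    return max(cost / optimal_cost, optimal_cost / cost)

# Minimization: the algorithm returns cost 4 where the optimum is 2.
ratio_min = approximation_ratio(4, 2)
# Maximization: the algorithm returns value 3 where the optimum is 6.
ratio_max = approximation_ratio(3, 6)
```

Taking the maximum of the two quotients is what lets a single definition cover both kinds of problem: whichever of C and C* is larger ends up in the numerator.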

Vertex Cover Problem:

A vertex cover of an undirected graph G = (V,E) is a subset V' ⊆ V such that for all edges (u,v) ∈ E either u ∈ V' or v ∈ V' (or both). The problem here is to find a vertex cover of minimum size in a given graph G. Optimal vertex cover is the optimization version of an NP-complete problem, but it is not too hard to find a vertex cover that is near optimal.

Algorithm:

ApproxVertexCover (G)
{
    C = {};
    E' = E
    while E' is not empty
        do Let (u, v) be an arbitrary edge of E'
           C = C ∪ {u, v}
           Remove from E' every edge incident on either u or v
    return C
}
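The procedure above can be sketched in Python. As an implementation choice of this sketch, edges are scanned once in the given order and an edge is skipped when one of its end points is already in the cover, which has the same effect as removing all edges incident on u or v from E'. The function name and the sample path graph are illustrative assumptions.

```python
def approx_vertex_cover(edges):
    """Greedy vertex cover: take both end points of each uncovered edge."""
    cover = set()
    for u, v in edges:
        # An edge with an end point already in the cover counts as removed.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

# Path a - b - c - d; an optimal cover is {b, c}, of size 2.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
cover = approx_vertex_cover(edges)
```

On this path the sketch returns all four vertices, twice the optimum. That is the worst the scheme can do: the chosen edges share no end points, so any cover must contain at least one end point of each of them, which is the standard argument that this is a 2-approximation algorithm.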

Example: (vertex cover running example for graph below)

[Figure: example graph]

Solution:

[Figure: steps of the algorithm on the example graph; the optimal vertex cover is shown as lightly shaded vertices]

Analysis:

If E' is represented using adjacency lists, the above algorithm takes O(V+E) time, since each edge is processed only once and every vertex is processed only once throughout the whole operation.

Tribhuvan University
Bachelor of Science in Computer Science 
And Information Technology 
Examination, 2069 
New Summit College 

(Old Baneshowr, Kathmandu) 

Subject: “Design and Analysis of Algorithm(DAA)” FM:80 

Time: 3 hr. PM: 32 

Attempt all the Questions {10 x 8 =80} 

1. Why are asymptotic notations important in algorithm analysis? Describe big-O, big-Ω and big-theta notation with suitable examples. {2 + 2 + 2 + 2}

2. What is recurrence relation? Prove that the complexity of the recurrence relation “T(n) = 

8T(n/2) + m” is O(m) by using substitution method. {1+7} 

3. Given the following block of code, write a recurrence relation for it and also find its asymptotic upper bound. (Assume that all dotted code takes constant time.) {4+4}

Fun(int n)
{
    ........
    if(condition1)
        x=Fun(n/2)
    else if(condition2)
        x=Fun(2n/3)
    else
        x=Fun(n/4)
    ........
}

4. What is the concept behind randomized quick sort? Write down its algorithm and give its average case analysis. {1+3+4}

5. What is meant by median order statistics? Write the algorithm for expected linear time selection and analyze it. {1+3+4}

6. Devise a divide and conquer algorithm for finding the minimum and maximum element among a set of given elements. Write the recurrence relation for your algorithm and give its big-O estimate. {5+3}


7. What are the characteristics of problem that can be solved by using dynamic programming 
algorithm? Give the recursive definition of solving 0/1 knapsack problem. Trace the algorithm 
for w={3,4,2,2,3}, v={ 12,14,6,5,6} and knapsack of capacity 12. (2+1+5) 

8. Write the recurrence relation for Longest Common subsequence problem(LCS). Trace the 
algorithm to find LCS of X={a,b,c,b,d,a,b} and Y={b,d,c,a,b,a}. (2+6) 


9. Use the master method to find the big-O estimates of the recurrences: {4+4}
a. T(n) = 3T(n/2) + n

b. T(n) = 4T(n/2) + m 

10. Show all the steps required for sorting an array of size 10 by using Heap sort.
a[10]={5, 3, 2, 4, 7, 8, 1, 11, 9, 15}. (8)

Hint: At first construct a heap and then sort by using Heap sort properties. 

A Complete Note in Design And Analysis of Algorithms 



Email: Saud.bhupendra427@gmail.com